Executive Series: What Is Unstructured Data?

June 1, 2016 Unstructured Data, Big Data

What is unstructured data?

Unstructured data is a catch-all term used to describe free-form information—text, images, audio, videos—that is not organized inside a well-defined storage structure, such as a relational database management system or a financial application. Unstructured data comes in many forms.

How is it different from structured data?

Structured data has a predefined data model, with well understood hierarchies and relationships among data, well established validation rules, and disciplined management. Structured data is typically centralized, cohesive, and consistent. It is optimized for business intelligence—search, retrieval, and analytics—but it is limited in scope.

Unstructured data, by contrast, grows organically and inconsistently, and is ungoverned by data model constraints. It propagates sporadically, in many formats, across many different devices, with no predictable regularity, quality or completeness.

All data falls along a continuum between structured and unstructured data.

Why should I pay attention to unstructured data?

Unstructured data is growing at a much faster rate than structured data—some say it represents 80 percent of all new data. Increasingly, your unstructured data contains much of the crucial information you’ll need to run your organization.

However, legacy database systems won’t help you find it, because it is scattered across many locations and formats. Critical information that is buried and dispersed in random unstructured data may be all but invisible to business intelligence searches. Suppose you wanted to analyze and fix a major product design problem reported by your biggest customer. What existing database system would allow you to manage all of the information you need for your analysis, including design specifications, mechanical drawings, test results, emails, presentations, proposals, production plans, meeting notes, contracts and phone logs?

Dark data is operational data that your organization collects and saves but never analyzes. Matt Aslett of 451 Research defines it as “data that was previously ignored because of technology limitations.” Most likely, you have substantial dark but potentially useful data stored in business documents, messages, engineering drawings and so on.

Cisco Systems analyzed Wi-Fi router log files at Copenhagen Airport to identify patterns of passenger behavior. While moving through the airport, passengers’ smartphones pinged the various routers, even if they didn’t actually log onto the network. This time-stamped, location-based data provided useful insights into passenger movements, bottlenecks and popular locations such as areas within shops. This allowed Cisco to optimize the size and locations of the Wi-Fi access points.
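
For readers who want to see what this kind of analysis might look like, here is a minimal Python sketch. It is not Cisco's method, which the source does not describe. The record layout (timestamp, device ID, access point name), the sample values and the function names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical ping records: (timestamp in seconds, device id, access point name).
# Real router logs would need parsing and anonymizing first.
PINGS = [
    (0, "dev-a", "security"),
    (60, "dev-a", "security"),
    (400, "dev-a", "shops"),
    (30, "dev-b", "security"),
    (500, "dev-b", "gate-12"),
    (45, "dev-c", "shops"),
]


def devices_per_access_point(pings):
    """Count the distinct devices seen by each access point."""
    seen = defaultdict(set)
    for _, device, access_point in pings:
        seen[access_point].add(device)
    return {ap: len(devices) for ap, devices in seen.items()}


def transitions(pings):
    """Count moves between consecutive access points, per device."""
    by_device = defaultdict(list)
    # Sorting the tuples orders them by timestamp first.
    for _, device, access_point in sorted(pings):
        by_device[device].append(access_point)
    moves = Counter()
    for path in by_device.values():
        for src, dst in zip(path, path[1:]):
            if src != dst:  # ignore repeated pings at the same spot
                moves[(src, dst)] += 1
    return moves


if __name__ == "__main__":
    print(devices_per_access_point(PINGS))
    print(transitions(PINGS).most_common())
```

Counting distinct devices per access point hints at popular locations, while the transition counts hint at how people flow between them. A real study would also have to handle privacy, duplicate devices and noisy signals.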

Where can I learn more about unstructured data?

We recommend these resources for a deeper dive into unstructured data:

Who is Peaxy? Why are you telling me this?

Peaxy helps enterprises keep up with the challenge of storing, retrieving, managing and curating exponentially growing unstructured data assets. The Peaxy Aureum™ data access platform aggregates, organizes and manages massive amounts of unstructured data without disrupting business operations.


Peaxy Executive Series is designed to explain quickly and simply what business leaders need to know about data access and aggregation.