Structured vs. Unstructured Data
Structured data resides in fixed fields within a record or file, as in relational databases and spreadsheets. Structured fields are easier to store, update, and query. For a long time, nearly all stored data was structured: data was processed into a structured form before it was stored. Business analysts in particular felt they could manipulate only atomic data. For some time, data that was not in a structured format was simply not stored or analyzed.
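As a minimal sketch of what "easier to store, update and query" means in practice, the following uses Python's built-in sqlite3 module with an in-memory database. The table name, column names, and values are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical table and values, used only to illustrate fixed fields.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO customers (id, name, city, balance) VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Pune", 120.50), (2, "Ben", "Leeds", 75.00), (3, "Chen", "Pune", 310.25)],
)

# Update: one field of one row can be addressed directly by name.
conn.execute("UPDATE customers SET balance = balance + 10 WHERE id = ?", (2,))

# Query: filtering and aggregation operate on named fields.
pune_total = conn.execute(
    "SELECT SUM(balance) FROM customers WHERE city = ?", ("Pune",)
).fetchone()[0]
ben_balance = conn.execute(
    "SELECT balance FROM customers WHERE id = ?", (2,)
).fetchone()[0]

print(pune_total, ben_balance)
```

Because every value sits in a named field of a known type, the update and the aggregate each take a single statement. Nothing here depends on SQLite specifically; any relational store would behave similarly.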
Unstructured data is data that has no predefined structure. It could be a free-form text field containing dates, numbers, and various other important data points. Content posted on social media sites such as Twitter and Facebook is a good example of unstructured data. Patient data stored in a hospital database is another good example, because it contains free-form text fields describing ailments, as well as lab reports.
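To illustrate how dates and numbers can be buried in a free-form text field, here is a small sketch using Python's re module. The clinical note, the date format, and the list of units are all invented for this example; real notes vary far more than this.

```python
import re

# Hypothetical free-form note; wording and values are invented for illustration.
note = (
    "Patient seen on 2014-03-12 with fever of 101.4 F. "
    "Follow-up on 2014-03-19; glucose 98 mg/dL."
)

# Dates written as YYYY-MM-DD. Other date formats would need other patterns.
dates = re.findall(r"\b\d{4}-\d{2}-\d{2}\b", note)

# A number followed by one of two assumed units (F or mg/dL).
measurements = re.findall(r"(\d+(?:\.\d+)?)\s*(F|mg/dL)\b", note)

print(dates)
print(measurements)
```

The patterns recover the dates and readings, but not which date is the visit and which is the follow-up, or what the fever reading means in context. That surrounding meaning lives in the prose and is easy to lose when the text is reduced to fields.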
Structured data, then, is well prepared to be stored in specific formats in a specified location such as a database, and corporations know very well how to handle it. Unstructured data, by contrast, is often too large and too irregular to fit into rows and columns. Converting unstructured data into structured data is very difficult and often unnecessary, because the data can lose its significance in the conversion. Ideally, structured data should also be simple in nature.
Different Data Types and Their Warehousing
Today, every organization wants to keep its big unstructured data along with its structured data. By 2012, most organizations had started to incorporate big data. The figure below shows the growth of different types of data in an organization as of 2012.
Unstructured data was growing then and is still growing, helped by breakthroughs in technologies for managing big data, such as Hadoop and Hive, and by the rise of NoSQL databases such as MongoDB. With the advent of cloud computing, the size of big data is no longer the issue it used to be. Even so, a recent survey suggests that many database specialists struggle with NoSQL databases, and many more want a single platform that handles both structured and unstructured data.
Data warehousing is the science of making sense of different data in order to better answer questions about the business. Making sense of even perfectly structured data with data warehousing techniques is a challenge for many organizations, so warehousing big unstructured data is a greater challenge still.
Data warehousing determines the kinds of analysis that are possible with the data at the time the data enters the system. This works well for unchanging, atomic, structured data. For dynamic big data from social media, however, the approach breaks down. Many companies are trying to come up with solutions that help traditional data warehousing systems deal with the uncertainties of big unstructured data.
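The point about analysis being fixed when data enters the system can be pictured with a small sketch. The practice of deciding structure at load time is commonly called schema-on-write, and deferring it until a question is asked is commonly called schema-on-read; those labels, the column names, and the sample records below are assumptions added for illustration, not taken from any particular warehouse product.

```python
import json

# Hypothetical fixed schema, decided when the warehouse table was designed.
WAREHOUSE_COLUMNS = ("order_id", "amount")


def load_fixed(record):
    """Keep only the columns the schema anticipated; anything else is dropped at load time."""
    return {col: record.get(col) for col in WAREHOUSE_COLUMNS}


# Hypothetical incoming records; the second carries fields nobody planned for.
incoming = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 2, "amount": 35.5, "tweet": "Love the new store!", "hashtags": ["retail"]},
]

# Structure decided on the way in: unplanned fields never reach the warehouse.
warehouse_rows = [load_fixed(r) for r in incoming]

# Alternative: keep each raw record and impose structure only when querying.
raw_store = [json.dumps(r) for r in incoming]
parsed = [json.loads(s) for s in raw_store]
tweets = [p["tweet"] for p in parsed if "tweet" in p]

print(warehouse_rows)
print(tweets)
```

In the first path, a later question about customer tweets cannot be answered because the field was discarded on entry. In the second, the question can still be asked, at the cost of parsing and validating the raw records at query time.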
Data Warehousing in the Age of Big Data
Business Intelligence/Data Warehousing (BI/DW) has provided a novel means of assimilating data into business processes and of visualizing that data. This helps senior management understand the business better. Big data is very different: it is a means of storing very large volumes of unstructured data.
Since big data poses challenges to traditional data warehousing systems, big data vendors, such as those building on Hadoop, have come up with a hybrid transaction processing system. The difference between the two is shown below:
Though the new method offers real-time analysis, traditional DW/BI is not going anywhere. Most organizations already use DW tools and will continue to use them. The visualization offered by BI tools is also very helpful to senior management in making informed decisions from the data. Though big data is new and dynamic, it has many gaps and cannot be relied upon completely for decision making. DW/BI tools will continue to work on the data and help businesses prosper.
http://www.theregister.co.uk/2012/10/08/big_data_revolution/
http://www.webopedia.com/TERM/S/structured_data.html
http://en.wikipedia.org/wiki/Unstructured_data
http://www.kpipartners.com/blog/bid/137981/Structured-Data-vs-Unstructured-Data
http://www.stocknewsnow.com/newsrss/1986581-Technology
http://www.onapproach.com/7-challenges-consider-building-data-warehouse/
http://searchbusinessanalytics.techtarget.com/feature/Big-data-vendors-should-stop-dissing-data-warehouse-systems
http://www.kdnuggets.com/2014/06/data-lakes-vs-data-warehouses.html
http://www.infoworld.com/article/2607810/cloud-computing/the-cloud-and-big-data-are-no-threat-to-data-warehouses.html
http://timoelliott.com/blog/2014/04/no-hadoop-isnt-going-to-replace-your-data-warehouse.html