What impact does the scale of large data have?
Data has become capital and leverage for companies that collect it at scale. The volume of data harvested and stored in a database makes it difficult to understand without a proper data analytics approach. The larger the dataset becomes, the harder it is to build an algorithm that can mitigate errors and produce a coherent chart or search result that supports useful analysis.
Is it enough to collect data?
Collecting data can be sufficient by itself if you only intend to store information and display it back to the user on demand. However, if you are trying to derive an intelligent result from it, for example when serving an internet search, you will need to analyze the data using a formula. One example of such a formula is linear regression, which fits a straight line through the data so that the underlying relationship can be separated from the margin of error around it. Data mining has emerged as a solution that addresses both completeness, by building patterns from the data, and optimization, by filtering the good data out of the dataset.
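To make the linear regression idea concrete, here is a minimal sketch of an ordinary least-squares line fit in plain Python. The function name `fit_line` and the sample values are hypothetical, chosen only for illustration; the residuals are the part of each observation that the fitted line does not explain, which is roughly what the text calls the margin of error.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of the x/y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations: y grows roughly linearly with x, plus some noise.
xs = [0, 1, 2, 3, 4]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

slope, intercept = fit_line(xs, ys)

# Residual = observed value minus the value predicted by the fitted line.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
```

The fitted line summarizes the relationship in two numbers, while the residuals show how far each individual point sits from it.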
What and how can we learn from data?
From data we can learn a trend or discover a problem. When we build a formula to analyze a large amount of data, it should present the results in a way that highlights irregularities and, most importantly, trends.
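One simple way to surface irregularities is to flag values that sit far from the average. The sketch below uses a z-score test, which is one possible technique and not something the text prescribes; the function name `find_irregular`, the threshold of 2.0, and the sample readings are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def find_irregular(values, threshold=2.0):
    """Return the indexes of values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing stands out.
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical readings: mostly stable, with one obvious spike.
readings = [10, 11, 9, 10, 12, 10, 50, 11]

irregular = find_irregular(readings)
```

With these sample readings only the spike at index 6 is flagged. On real data the threshold would need tuning, since a single extreme value also inflates the standard deviation it is measured against.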
How can we turn data into useful insights?
Charting data produces a visual that condenses millions of data points into a human-friendly image. Grouping similar instances and normalizing the data into a table with an (X, Y) representation lets us draw a line chart that shows the behavior of a trend. For example, if we track the price of rice on the open markets for 10 years, we will have price fluctuations recorded as (date, price) pairs for (X, Y). Smoothing the line in the chart to remove the error margins of everyday market fluctuation can bring out the major market movers that occurred in those 10 years, for example a war, a drought, or a disease.
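The smoothing step in the rice example could be sketched with a simple moving average, which is one common way to damp short-term fluctuation before charting. The prices below are invented for illustration and are not real market data; `moving_average` and the window size of 3 are likewise assumptions.

```python
def moving_average(values, window):
    """Return the average of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical yearly rice prices over 10 years, with a shock in years 5 and 6.
prices = [400, 410, 405, 415, 640, 650, 430, 420, 425, 435]

smoothed = moving_average(prices, 3)

# Position of the highest smoothed value: where the sustained movement peaks.
peak = smoothed.index(max(smoothed))
```

Plotting `smoothed` against the dates instead of the raw prices gives a calmer line in which a sustained shock still stands out, while one-off wobbles are averaged away. A larger window smooths more but also blurs when the shock began.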
1. Silberschatz, A., Korth, H.F. and Sudarshan, S., 2019. Database system concepts. New York: McGraw-Hill.
2. Han, J., Kamber, M. and Pei, J., 2011. Data mining: Concepts and techniques. Amsterdam: Elsevier.