Data scientists are often considered wizards who deliver value from big data. These wizards need knowledge in three very distinct subject areas: scalable data management (e.g., data warehousing, Hadoop, parallel processing, query processing, SQL, and storage & resource management), data analysis (e.g., advanced statistics, linear algebra, optimization, machine learning, and mathematics), and domain expertise (e.g., engineering, logistics, medicine, or physics).
In this blog post, I will discuss some economic, societal, and legal issues concerning big data. One often hears that "data is the new oil." Like data, oil is a complex product: it passes through numerous processing and refinement steps and depends on an entire economic ecosystem of drilling stations, refineries, and distribution networks, including filling/gas stations. One can draw an analogous picture for the big data realm, where raw data must likewise be collected, cleaned, refined, and delivered before it yields value.
Today, analysts seek to derive insight from large, heterogeneous, high-velocity (i.e., big) data sets using a variety of data analysis methods. These data sets are ubiquitous: they arise from burgeoning cloud computing services, the anticipated Internet of Services (IoS), and the emerging Internet of Things (IoT). Big data is often defined as any data set that cannot be handled using today's widely available mainstream solutions, techniques, and technologies.