Understanding Big Data Through Smartphones and Beyond
Data Generation By Smartphones
- Daily activities generate massive data: texts, calls, emails, photos, videos, searches, and music.
- A single smartphone user is said to generate about 40 exabytes of data per month.
- Multiplied across 5 billion smartphone users, the total volume becomes enormous.
- This data overload is termed 'Big Data'.
Internet Activity Data Per Minute
- Snapchat: 2.1 million snaps shared
- Google: 3.8 million search queries
- Facebook: 1 million logins
- YouTube: 4.5 million videos watched
- Emails: 188 million sent
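To get a feel for scale, the per-minute figures above can be extrapolated to a full day. This is only a rough sketch: it assumes the rate stays constant around the clock, which the source does not claim, and the names `PER_MINUTE` and `per_day` are made up for illustration.

```python
# Per-minute figures as listed above.
PER_MINUTE = {
    "Snapchat snaps shared": 2_100_000,
    "Google search queries": 3_800_000,
    "Facebook logins": 1_000_000,
    "YouTube videos watched": 4_500_000,
    "Emails sent": 188_000_000,
}

MINUTES_PER_DAY = 24 * 60  # 1,440 minutes


def per_day(per_minute):
    """Scale each per-minute count to a day, assuming a constant rate (an assumption)."""
    return {name: count * MINUTES_PER_DAY for name, count in per_minute.items()}


if __name__ == "__main__":
    for name, total in per_day(PER_MINUTE).items():
        print(f"{name}: {total:,} per day")
```

Under that constant-rate assumption, 3.8 million searches per minute works out to roughly 5.5 billion searches per day.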
Classification of Big Data: The Five V's
- Volume: Massive amounts of data (e.g., healthcare records: 2,314 exabytes annually).
- Velocity: Speed at which data is generated and processed.
- Variety: Different data types: structured, semi-structured, and unstructured (e.g., Excel files, log files, and X-ray images, respectively).
- Veracity: Accuracy and trustworthiness of data.
- Value: Benefits derived from analyzing the data (e.g., faster disease detection, better treatment in healthcare).
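The Variety bullet above can be made concrete with a small sketch of how each kind of data is typically handled in code. The sample values (the patient row, the log line, and the placeholder bytes) are invented for illustration and are not from the source.

```python
import csv
import io
import json

# Structured: fixed columns, like a spreadsheet or a database table.
structured_text = "patient_id,age,diagnosis\n101,54,hypertension\n"
rows = list(csv.DictReader(io.StringIO(structured_text)))

# Semi-structured: self-describing fields that can nest or vary per record,
# like a JSON-formatted log line.
log_line = '{"ts": "2024-01-01T10:00:00", "event": "login", "meta": {"device": "phone"}}'
record = json.loads(log_line)

# Unstructured: raw bytes with no built-in schema, like an image file.
# These four bytes are an arbitrary placeholder standing in for image data.
image_bytes = bytes([0x00, 0x01, 0x02, 0x03])

if __name__ == "__main__":
    print(rows[0]["diagnosis"])      # column looked up by name
    print(record["meta"]["device"])  # nested field looked up by key
    print(len(image_bytes), "bytes") # only size is known without further analysis
```

The practical difference is how much structure the program can rely on: column names for structured data, keys that may or may not be present for semi-structured data, and nothing beyond raw bytes for unstructured data.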
Storage and Processing of Big Data
- Frameworks: Cassandra, Hadoop, Spark.
- Hadoop example:
  - Storage: Uses the Hadoop Distributed File System (HDFS).
    - Breaks large files into smaller chunks (blocks) and stores them across multiple machines.
    - Keeps copies of each chunk on different machines, so the data survives if one machine fails.
  - Processing: Uses MapReduce.
    - Splits a job into smaller tasks and runs them in parallel on different machines (parallel processing).
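The storage and processing ideas in the Hadoop example can be sketched in a few lines of plain Python. This is a toy model, not Hadoop's API: the node names, chunk size, and replica count are assumptions chosen for illustration, threads stand in for separate machines, and real MapReduce includes a shuffle step between map and reduce that is omitted here.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

WORDS_PER_CHUNK = 3   # toy chunk size; real HDFS blocks are far larger
REPLICATION = 3       # copies kept per chunk (assumed for illustration)
NODES = ["node1", "node2", "node3", "node4"]  # hypothetical machine names


def split_into_chunks(text, words_per_chunk=WORDS_PER_CHUNK):
    """Storage step 1: break one large input into smaller chunks."""
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


def place_replicas(num_chunks, nodes=NODES, replication=REPLICATION):
    """Storage step 2: assign each chunk to several distinct nodes (round-robin)."""
    return {
        chunk_id: [nodes[(chunk_id + r) % len(nodes)] for r in range(replication)]
        for chunk_id in range(num_chunks)
    }


def map_chunk(chunk):
    """Map: count the words in a single chunk, independently of the others."""
    return Counter(chunk.lower().split())


def word_count(text):
    """Run the map step on every chunk in parallel, then reduce by merging counts."""
    chunks = split_into_chunks(text)
    with ThreadPoolExecutor() as pool:
        partial_counts = list(pool.map(map_chunk, chunks))
    return reduce(lambda a, b: a + b, partial_counts, Counter())


if __name__ == "__main__":
    text = "big data is big and data is everywhere"
    chunks = split_into_chunks(text)
    print(chunks)
    print(place_replicas(len(chunks)))
    print(word_count(text))
```

Because every chunk lives on more than one node in this model, losing any single node still leaves at least two copies of each chunk. The map step needs no coordination between chunks, which is what makes parallel execution possible.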
Applications of Big Data Analytics
- Gaming: Insights into user behavior for improving game design (e.g., Halo 3, Call of Duty).
- Disaster Management: Improved predictions and responses (e.g., Hurricane Sandy).
Quiz Question
Which statement about the Hadoop Distributed File System (HDFS) is not correct?
A. HDFS is the storage layer of Hadoop.
B. Data gets stored in a distributed manner in HDFS.
C. HDFS performs parallel processing of data.
D. Smaller chunks of data are stored on multiple data nodes in HDFS.
Conclusion and Engagement
- Open question: What impact might big data have in the future?
- Call to action: Like, share, subscribe, and comment.
- Follow up with questions or thoughts in the comments.