[Intro to Hadoop and MapReduce] Lesson 1 Big data
You can read more about Big Data on Wikipedia, which itself generates and processes huge amounts of data.
2. Data Sources
According to IBM: “Every day, 2.5 billion gigabytes of high-velocity data are created in a variety of forms, such as social media posts, information gathered in sensors and medical devices, videos and transaction records”
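To put the IBM figure in more familiar units, here is a small back-of-the-envelope conversion. It assumes decimal units (1 gigabyte = 10^9 bytes) and an even rate across the day; both are simplifying assumptions, not part of the IBM statement.

```python
# Rough conversion of "2.5 billion gigabytes per day" into other units.
# Assumption: decimal units (1 GB = 10**9 bytes), data created at a constant rate.
GIGABYTE = 10**9
TERABYTE = 10**12
EXABYTE = 10**18

daily_bytes = 2_500_000_000 * GIGABYTE      # 2.5 billion gigabytes
exabytes_per_day = daily_bytes / EXABYTE    # same volume expressed in exabytes

seconds_per_day = 24 * 60 * 60
terabytes_per_second = daily_bytes / seconds_per_day / TERABYTE

print(f"{exabytes_per_day} EB per day")
print(f"{terabytes_per_second:.1f} TB per second")
```

Under those assumptions the quoted volume works out to 2.5 exabytes per day, or roughly 29 terabytes every second.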
3. Quiz: Big Data
What is BIG DATA?
4. Definition of Big Data
A reasonable definition of big data might be: it’s data that’s too big to be processed on a single machine.
Big Data is a loosely defined term used to describe data sets so large and complex that they become awkward to work with using standard statistical software. (International Journal of Internet Science, 2012, 7 (1), 1–5)
5. Quiz: Challenges
Challenges with big data
6. The 3 Vs - Volume
The 3 V’s were first defined in a research report by Douglas Laney in 2001 titled “3D Data Management: Controlling Data Volume, Velocity and Variety”.
In 2012 he updated the definition as follows: “Big data is high volume, high velocity, and/or high variety information assets that require new forms of processing to enable enhanced decision making, insight discovery and process optimization”.
7. Quiz: Worthwhile Data
Data Worth Storing?
The problem is that to store data in systems like that (traditional databases), the data needs to fit in pre-defined tables. And a lot of the data we deal with these days tends to be what we call unstructured or semi-structured data.
9. Data Formats
One nice thing about Hadoop is that it doesn’t care what format your data comes in. Unlike with a traditional database, you can store the data in its raw format, then manipulate and reformat it later.
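The idea of storing raw data first and imposing structure later can be sketched in a few lines. This is plain Python rather than any Hadoop API, and the sample records, field positions, and the `parse_line` helper are all hypothetical, made up only to illustrate the point.

```python
import json

# Hypothetical raw records in mixed formats, kept exactly as they arrived.
raw_lines = [
    "2012-01-01\t09:00\tSan Jose\tMusic\t214.05\tAmex",        # tab-separated
    '{"date": "2012-01-01", "store": "Miami", "cost": 12.5}',  # JSON
    "free-form note: register 4 was offline",                  # unstructured text
]

def parse_line(line):
    """Apply a schema at read time; return None if the line fits no known format."""
    if line.startswith("{"):
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            return None
        return {"store": record.get("store"), "cost": record.get("cost")}
    fields = line.split("\t")
    if len(fields) == 6:
        # Assumed column order: date, time, store, item, cost, payment.
        return {"store": fields[2], "cost": float(fields[4])}
    return None

# The raw lines stay untouched; structure is imposed only when we read them.
parsed = [parse_line(line) for line in raw_lines]
structured = [record for record in parsed if record is not None]
print(structured)
```

Nothing is thrown away when the data is stored: the free-form line is simply skipped by this particular reader, and a different reader written later could still make use of it.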
10. Quiz: Using Variety
12. Quiz: Your Interests
What data interests you? > Survey question; there is no right answer.
13. Doug Intro
14. Doug Cutting: The Origins of Hadoop
Doug Cutting, Creator of Hadoop
15. Hadoop Logo Intro
16. Doug Cutting: The Name of Hadoop
The name Hadoop came from his son’s toy elephant.
17. Core Hadoop
Cloudera provides a free download of Chapter 2 of Tom White’s essential text, Hadoop: The Definitive Guide.
18. Hadoop Ecosystem
See more in the free Chapter 2 of Tom White’s essential text, Hadoop: The Definitive Guide.