Bigdata

BIG DATA Sourabh Dattawad Department of Computer Science KLS Gogte Institute of Technology Belgaum, India

Transcript of Bigdata

Page 1: Bigdata

BIG DATA

Sourabh Dattawad, Department of Computer Science, KLS Gogte Institute of Technology

Belgaum, India

Page 2: Bigdata

Contents

• Introduction
• What is Big Data?
• Characteristics of Big Data
• What is Big Data analytics?
• How does Big Data work?
• Applications of Big Data
• Big Data growth
• What’s trending
• Conclusion
• References

Page 3: Bigdata

Introduction

• A decade ago, far less data was being produced.
• Today the amount of data in the world is increasing rapidly, outstripping not only our machines but also our imagination.

Page 4: Bigdata

What can be done with this data?

• Discarding this data is not a great idea.
• Big Data has the potential to help companies improve operations and make faster, more intelligent, and more accurate decisions.
• More accurate analyses lead to more confident and effective decision making, and better decisions can mean lower costs and reduced risk.

Page 5: Bigdata

Definition

• Big Data is a term for a diverse field of data analysis in which the datasets are so massive that they become hard to store, work with, analyze, and draw predictions from using traditional databases and software.

Page 6: Bigdata

Characteristics of Big Data

• Big Data is characterized by the following:

Page 7: Bigdata

Volume

• Volume is the quantity of data generated, which determines the value and potential of the data.
• Facebook receives more than 12 million photos every hour.
• Twitter users post more than 400 million tweets every day.

Page 8: Bigdata

Velocity

• Velocity is the rate at which data is generated.
• Every minute, 48 hours of new video are uploaded to YouTube.
• Every minute, Google processes 2 million search queries.
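To make the figures on the Volume and Velocity slides easier to compare, they can be converted to a common per-second rate. The sketch below only does that arithmetic on the numbers quoted in the slides; the variable names are illustrative.

```python
# Convert the rates quoted on the Volume and Velocity slides to per-second figures.
SECONDS_PER_MINUTE = 60
SECONDS_PER_HOUR = 60 * SECONDS_PER_MINUTE
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

photos_per_second = 12_000_000 / SECONDS_PER_HOUR      # Facebook: 12 million photos per hour
tweets_per_second = 400_000_000 / SECONDS_PER_DAY      # Twitter: 400 million tweets per day
searches_per_second = 2_000_000 / SECONDS_PER_MINUTE   # Google: 2 million queries per minute

# YouTube: 48 hours of video per minute, expressed as seconds of video per second.
video_seconds_per_second = 48 * SECONDS_PER_HOUR / SECONDS_PER_MINUTE

print(f"Photos per second:   {photos_per_second:,.0f}")
print(f"Tweets per second:   {tweets_per_second:,.0f}")
print(f"Searches per second: {searches_per_second:,.0f}")
print(f"Seconds of video uploaded per second: {video_seconds_per_second:,.0f}")
```

On these figures, roughly 3,300 photos, 4,600 tweets, and 33,000 search queries arrive every second, and 2,880 seconds (48 minutes) of video are uploaded per second.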

Page 9: Bigdata

Variety

• Variety is the category to which the data belongs.
• The categories include the health sector, social networking, banking, etc.

Page 10: Bigdata

What is Big Data analytics?

• Analyzing large datasets and drawing conclusions from them is called Big Data analytics.
• Two real-life examples illustrate it:
– Google’s Flu Trends
– Target retailer

Page 11: Bigdata

Google’s Flu Trends

• Google predicted flu trends just by analyzing search data.
• In 2009 a new flu virus, H1N1, was discovered.
• Flu causes 250,000-500,000 deaths worldwide every year.
• The swine flu pandemic threatened to be worse.
• Flu surveillance is carried out by the Centers for Disease Control and Prevention (CDC).

Problems faced by the CDC:
– Data was reported only weekly
– 1-2 week publication lag

Page 12: Bigdata

What did they do?

• Google took the 50 million most common search terms typed in the United States and compared their frequency with CDC data on the spread of the flu.
• They processed 450 million different models to test the search terms, and the resulting predictions closely matched the statistics published by the CDC.
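The idea of comparing search-term frequency with CDC data can be illustrated with a simple correlation check. This is only a minimal sketch, not Google's actual method: the weekly numbers and the two search terms are made up for illustration, and a plain Pearson correlation stands in for the far larger set of models mentioned above.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical weekly flu case counts (stand-in for CDC data).
cdc_flu_cases = [10, 14, 22, 35, 30, 18]

# Hypothetical weekly search counts for two made-up terms.
search_counts = {
    "flu symptoms": [100, 150, 210, 360, 290, 170],
    "cheap flights": [80, 75, 90, 70, 85, 78],
}

# Score each term by how closely its weekly frequency tracks the case counts.
scores = {term: pearson(counts, cdc_flu_cases) for term, counts in search_counts.items()}
best_term = max(scores, key=scores.get)

for term, score in scores.items():
    print(f"{term}: correlation {score:.2f}")
print("Best predictor:", best_term)
```

A term whose frequency rises and falls with the reported cases scores close to 1 and becomes a candidate signal; an unrelated term scores near 0 and is discarded.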

Page 13: Bigdata

Target Retailer

• The retailer Target predicted customers’ pregnancies just by analyzing their buying trends.
• Example: the story of a pregnant teenager.
• This shows that real-time data is never false.

Page 14: Bigdata

How Does Big Data Work?

• Apache Hadoop - Apache Hadoop is the software most commonly associated with Big Data. Apache describes it as “a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models”.

• With Hadoop, no data is too big. A huge dataset that takes a traditional system more than 20 hours to process can, in some cases, be processed in about 3 minutes.

Page 15: Bigdata

• MapReduce - MapReduce is used to split the data effectively. It is a software framework in which a job splits the input dataset into independent chunks that are processed in a completely parallel manner.

Simple Block Diagram
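The split, map, and reduce steps can be sketched in a few lines of plain Python. This is a toy word count that runs on one machine, not Hadoop's API: the function names and sample lines are illustrative, and the chunks are mapped one after another here, whereas a real cluster would hand each chunk to a different node.

```python
from collections import defaultdict

def split_into_chunks(lines, chunk_size):
    """Split the input into independent chunks of at most chunk_size lines."""
    return [lines[i:i + chunk_size] for i in range(0, len(lines), chunk_size)]

def map_chunk(chunk):
    """Map step: emit a (word, 1) pair for every word in the chunk."""
    pairs = []
    for line in chunk:
        for word in line.lower().split():
            pairs.append((word, 1))
    return pairs

def shuffle(mapped_chunks):
    """Group the emitted values by key across all chunks."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped

def reduce_counts(grouped):
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}

lines = ["big data is big", "data is everywhere", "big clusters process data"]
chunks = split_into_chunks(lines, chunk_size=1)

# Each chunk is mapped independently of the others, so this loop could run in parallel.
mapped = [map_chunk(chunk) for chunk in chunks]
word_counts = reduce_counts(shuffle(mapped))
print(word_counts)
```

Because no chunk depends on another during the map step, adding machines lets more chunks be processed at the same time.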

Page 16: Bigdata

Applications of Big Data

Page 17: Bigdata

Big Data Growth

Page 18: Bigdata

What’s trending

• By analyzing large volumes of DNA data, it may become possible to cure genetic diseases such as cancer.

• Analyzing data may even make it possible to predict where terrorists will try to attack.

Page 19: Bigdata

Conclusion

• Big Data is the next big thing. It is about letting the data speak, and real-time data is never false; hence it is a revolution that will transform how we think, live, and work.

Page 20: Bigdata

References

• Viktor Mayer-Schönberger and Kenneth Cukier, “Big Data: A Revolution That Will Transform How We Live, Work, and Think”.

• Cathy O'Neil and Rachel Schutt, “Doing Data Science”, O'Reilly Media.

• Apache Hadoop, http://hadoop.apache.org

Thank You