BIG DATA BY SAIKIRAN PANJALA

Posted on 21-Feb-2017



BIG DATA
Presented by

XXXXXX12XXXXX

Under the guidance of XXXX

JAVA CARD

Big data is a broad term for data sets so large or complex that traditional data processing applications are inadequate. Challenges include analysis, capture, data curation, search, sharing, storage, transfer, visualization, and information privacy. The term often refers simply to the use of predictive analytics or certain other advanced methods to extract value from data, and seldom to a particular size of data set. Accuracy in big data may lead to more confident decision making, and better decisions can mean greater operational efficiency, lower costs, and reduced risk.

ABSTRACT


No single standard definition…

“Big Data” is data whose scale, diversity, and complexity require new architecture, techniques, algorithms, and analytics to manage it and extract value and hidden knowledge from it…


Big Data Definition

What is data?

Types of data:
1. Relational Data
2. Text Data
3. Semi-structured Data
4. Graph Data
5. Streaming Data

Data

How much data do we use daily?

640K ought to be enough for anybody.

Lots of data is being collected and warehoused:
◦ Web data, e-commerce
◦ Purchases at department/grocery stores
◦ Bank/credit card transactions
◦ Social networks

Big Data Everywhere!

Maximilien Brice, © CERN

The Model of Generating/Consuming Data has Changed

The Model Has Changed…

Old Model: Few companies are generating data, all others are consuming data

New Model: all of us are generating data, and all of us are consuming data

Who’s Generating Big Data?

Social media and networks

Scientific instruments

Mobile devices

Sensor technology and networks

How much data are we using?

The Meaning of Big Data - 3 V’s

•Big Volume

•Big Velocity

•Big Variety

Data Volume
◦ 44x increase from 2009 to 2020
◦ From 0.8 zettabytes to 35 zettabytes

Data volume is increasing exponentially
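As a rough check on the growth figures above, the short sketch below works out the overall growth factor and the annual rate it would imply. It assumes growth compounds at a constant yearly rate, which the slide does not state; the variable names are illustrative only.

```python
# Figures taken from the slide: 0.8 zettabytes in 2009, 35 zettabytes in 2020.
start_zb = 0.8
end_zb = 35.0
years = 2020 - 2009  # 11 years

# Overall growth factor (the slide rounds this to 44x).
growth_factor = end_zb / start_zb

# Assumption: growth compounds at a constant annual rate.
annual_rate = growth_factor ** (1 / years) - 1

print(f"Total growth: {growth_factor:.2f}x")
print(f"Implied annual growth: {annual_rate:.1%}")
```

Under that assumption, data volume grows by roughly 41% per year, which means it doubles about every two years.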

Characteristics of Big Data: 1-Scale (Volume)

Consider closing price on all trading days for the last 5 years for two stocks A and B

What is the covariance between the two time-series?

cov(A, B) = (1/N) * sum_i (A_i - mean(A)) * (B_i - mean(B))
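The covariance formula can be sketched in a few lines of plain Python. The price lists below are made-up sample values, not real stock data, and the function names are illustrative.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def covariance(a, b):
    """Population covariance: (1/N) * sum((a_i - mean(a)) * (b_i - mean(b)))."""
    if len(a) != len(b) or not a:
        raise ValueError("series must be non-empty and of equal length")
    mean_a, mean_b = mean(a), mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)


# Hypothetical closing prices for two stocks over four trading days.
stock_a = [10.0, 11.0, 12.0, 13.0]
stock_b = [20.0, 22.0, 21.0, 25.0]

print(covariance(stock_a, stock_b))
```

For two stocks over five years this is a trivial computation; the difficulty the slides point to appears when the same calculation has to cover every pair among thousands of stocks.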

Big Data - Analytics: An Example

Ignoring the (1/N) factor and subtracting off the means, the covariances of all pairs of stocks come from a single matrix product:

Stock * Stock^T
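A minimal sketch of that matrix form, assuming each row of the matrix holds one stock's closing prices: subtract each row's mean, then multiply the matrix by its own transpose. Entry (i, j) of the result is N times the covariance of stocks i and j. The data and helper names are hypothetical.

```python
def center_rows(matrix):
    """Subtract each row's mean from every value in that row."""
    centered = []
    for row in matrix:
        row_mean = sum(row) / len(row)
        centered.append([value - row_mean for value in row])
    return centered


def times_transpose(matrix):
    """Compute matrix * matrix^T as a list of lists."""
    return [
        [sum(x * y for x, y in zip(row_i, row_j)) for row_j in matrix]
        for row_i in matrix
    ]


# Hypothetical closing prices: one row per stock, one column per trading day.
prices = [
    [10.0, 11.0, 12.0, 13.0],  # stock A
    [20.0, 22.0, 21.0, 25.0],  # stock B
]

gram = times_transpose(center_rows(prices))

# Dividing every entry by N (the number of days) gives the covariance matrix.
n_days = len(prices[0])
cov_matrix = [[entry / n_days for entry in row] for row in gram]
print(cov_matrix)
```

With S stocks the result has S * S entries, so the cost grows quadratically with the number of stocks. That is why the same exercise quickly outgrows a single machine when extended to a whole market.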

Now try it for companies headquartered in Charlotte!

Array Answer

Trading volume on Wall Street going through the roof

Breaking all their infrastructure

And it will just get worse

Big Velocity

Data is being generated fast and needs to be processed fast

Online Data Analytics Late decisions

Examples:
◦ E-Promotions
◦ Healthcare monitoring

Characteristics of Big Data: Speed (Velocity)

There are three forms of variety:
1. Structured
2. Semi-structured
3. Unstructured

Big variety
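To make the three forms of variety concrete, the sketch below represents one made-up customer order in each form, using only Python's standard library. The field names and sample values are hypothetical.

```python
import csv
import io
import json

# Structured: fixed columns, as in a relational table or CSV export.
structured = "order_id,customer,amount\n1001,alice,25.50\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing, with nesting and optional fields (JSON).
semi_structured = (
    '{"order_id": 1001, "customer": "alice", '
    '"items": [{"sku": "A1", "qty": 2}]}'
)
order = json.loads(semi_structured)

# Unstructured: free text with no schema; meaning has to be extracted.
unstructured = "Alice called to say order 1001 arrived late."
words = unstructured.split()

print(rows[0]["amount"])         # a column looked up by name
print(order["items"][0]["qty"])  # a nested field reached by path
print(len(words))                # only raw tokens are available directly
```

The further down this list the data sits, the more work is needed before it can be queried like a table.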

[Figure: The World of Data Integration; labels: enterprise text data warehouse, the rest of your data]

Some Make it 4V’s

What is Hadoop?

Apache Hadoop is a framework that allows for the distributed processing of large data sets.

It is an open-source data management…

Hadoop
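Hadoop's processing layer is built around the MapReduce programming model. The sketch below imitates that model in a single Python process to show the map, shuffle, and reduce steps on a word count. It does not use any Hadoop API; the function names and input chunks are hypothetical, and a real cluster would run the map and reduce steps on many machines in parallel.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


# Each string stands in for a block of a large file stored on a different node.
chunks = ["big data big volume", "big velocity", "data variety"]

mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
counts = reduce_phase(shuffle(mapped))
print(counts)
```

Because each chunk is mapped independently and each key is reduced independently, both steps can be spread across machines, which is what makes the model suit very large data sets.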

Hadoop key characteristics

Hadoop Eco-System

Advantages of big data

Disadvantages of big data

The biggest challenge for any big application

Choose wisely and move forward; otherwise, you cannot get the value out of the data

CONCLUSION

http://www.edureka.in/blog/the-hype-behind-big-data/

http://en.wikipedia.org/wiki/big-data

REFERENCES