Post on 22-Jan-2018
BIG DATA: EMERGING JUGGERNAUT IN THE IT INDUSTRY
JENNIFER BOOMERSHINE
RAJAT RAJAT
SCOTT SPRADLING
IDEAN TAGHEIZADEH
ROBERT HURLEY
BIG DATA
BIG DATA DESCRIBES THE APPLICATION OF NEW TOOLS AND
TECHNIQUES TO DIGITAL INFORMATION ON A SIZE AND SCALE WELL
BEYOND WHAT WAS POSSIBLE WITH TRADITIONAL APPROACHES,
TYPICALLY INVOLVING DATA SETS THAT ARE SO LARGE AND
COMPLEX THAT THEY REQUIRE ADVANCED DATA STORAGE,
MANAGEMENT, ANALYTICS, AND VISUALIZATION TECHNOLOGIES.
Abstracts of Papers, 250th ACS National Meeting & Exposition, Boston, MA, United
States, August 16-20, 2015 (2015), ANYL-28.
• Airlines: 10 TB every 30 minutes; 640 TB of data per flight.
• Internet User: By the end of 2015 internet traffic will exceed 4.8 ZB per year.
• Emails: 300 billion emails are sent every day.
• Facebook: 25 TB of data daily.
• Twitter: 12 TB of data daily. About 97,000 tweets are sent every second.
• YouTube: 2.9 billion video hours are watched on YouTube per month.
• Trading: NYSE produces 1 TB of data per trading day.
• Experiments: atomic particle experiments generate 40 TB of data per second.
FACTS
THREE V’S
10% STRUCTURED, 90% UNSTRUCTURED
Big Data:
• Velocity: real time, batches, data streams
• Variety: structured, unstructured, weakly correlated
• Volume: tables, petabytes, transactions
HOW MUCH?
IDC's Digital Universe Study, sponsored by EMC Corporation, December 2012.
BioTechnology: An Indian Journal (2014), 10(15, Pt. 4), 8811-8816.
50-fold increase in data volume
0.7 zettabytes in 2009
35 zettabytes in 2020
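The 50-fold figure follows directly from the two volumes on this slide. A minimal sketch of the arithmetic is below; the compound annual growth rate is a derived illustration that assumes steady year-over-year growth, which the slide itself does not state.

```python
# Volumes quoted on the slide, in zettabytes (ZB).
ZB_2009 = 0.7
ZB_2020 = 35.0

years = 2020 - 2009                      # 11-year span
fold_increase = ZB_2020 / ZB_2009        # ratio of end volume to start volume

# Assumption (not from the slides): growth compounds at a constant annual rate.
annual_growth = fold_increase ** (1 / years) - 1

print(f"{fold_increase:.0f}-fold increase over {years} years")
print(f"Implied compound annual growth: {annual_growth:.1%}")
```

Under that assumption, the data volume grows by a little over 40% each year.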
Unstructured Data
Structured Data
“Smart Data”
IN ORDER TO HANDLE BIG DATA, WE NEED SMART DATA
PROCESS
Abstracts of Papers, 250th ACS National Meeting & Exposition, Boston, MA, United
States, August 16-20, 2015 (2015), ANYL-28.
CRITICAL SOLUTION
OPEN SOURCE FRAMEWORK WITH MASSIVE PARALLEL PROCESSING
• APACHE HADOOP
• PYTHON
• MAPREDUCE
• GOOGLE FILE SYSTEM
• SAAS
Master node manages file locations & failures
Specific tasks are assigned to individual nodes
Scale: 1 MB to 1 GB, same protocol
Scaling is linear: increase processing capacity by increasing the number of computers
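The MapReduce idea listed above can be sketched as a word count in Python. This is a single-process illustration, not Hadoop's actual API: the function names `map_phase`, `shuffle_phase`, `reduce_phase`, and `word_count` are hypothetical, and a plain loop stands in for the worker nodes that a master node would assign chunks to.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    # In a real cluster each chunk would be mapped on a separate node;
    # here the chunks are simply processed one after another.
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    return reduce_phase(shuffle_phase(mapped))


# Two chunks stand in for two blocks of a larger distributed file.
chunks = ["big data big tools", "data streams and big data"]
counts = word_count(chunks)
print(counts)
```

Because each chunk is mapped independently, the same code applies whether there are two chunks or two million, which is the property that lets processing capacity grow with the number of computers.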
COMPETITIVE EDGE
COMPANIES WILL GAIN A COMPETITIVE EDGE BY:
• COLLECTING, ANALYZING, & UNDERSTANDING INFORMATION
• REDUCING THE MARKETING TIME OF PRODUCTS
• INCREASING THE SUCCESS RATE OF PRODUCT TRANSACTIONS
GOVERNMENTAL USE - PROACTIVE ACTIONS
• PROJECT VOTE SMART R – POLITICAL SCIENCE, SOCIAL, & ECONOMIC DATA
• PVSR IS SOFTWARE TO ANALYZE DATA & CREATE USEFUL INFORMATION
• GOOGLE FLU TRENDS (GFT): BUILDING AN EFFECTIVE STRATEGY
• INFLUENZA AFFECTS 5-20% OF THE U.S. POPULATION EVERY YEAR, RESULTING IN OVER 200,000 HOSPITALIZATIONS.
DUMBILL, E. (2013). MAKING SENSE OF BIG DATA. BIG DATA, 1(1), 1-2.
ACKNOWLEDGEMENT
UOFL SCIFINDER
UOFL PUBMED
DR. MANJU AHUJA
PROFESSIONAL MBA COHORT
COLLEGE OF BUSINESS
REFERENCES
ABSTRACTS OF PAPERS, 250TH ACS NATIONAL MEETING & EXPOSITION, BOSTON, MA, UNITED STATES, AUGUST 16-20, 2015 (2015), ANYL-28.
BOYD, D., & CRAWFORD, K. (2012). CRITICAL QUESTIONS FOR BIG DATA: PROVOCATIONS FOR A CULTURAL, TECHNOLOGICAL, AND SCHOLARLY PHENOMENON. INFORMATION, COMMUNICATION & SOCIETY, 15(5), 662-679.
CHEN, H., CHIANG, R. H., & STOREY, V. C. (2012). BUSINESS INTELLIGENCE AND ANALYTICS: FROM BIG DATA TO BIG IMPACT. MIS QUARTERLY, 36(4), 1165-1188.
DUMBILL, E. (2013). MAKING SENSE OF BIG DATA. BIG DATA, 1(1), 1-2.
IDC'S DIGITAL UNIVERSE STUDY, SPONSORED BY EMC CORPORATION, DECEMBER 2012.
BIOTECHNOLOGY: AN INDIAN JOURNAL (2014), 10(15, PT. 4), 8811-8816.