Big data gaurav

description

Big Data Class 1

Transcript of Big data gaurav

Page 1: Big data gaurav

BUMPER

Page 7: Big data gaurav

Understanding Big Data

Business Applications of Big Data

Technologies for handling Big Data

Big Data Management Systems – Databases & Warehouses

Analytics & Big Data

Class 1: Introduction to Big Data

Page 8: Big data gaurav

Topic 1

Class 1 Introduction to Big Data

Understanding Big Data

Page 11: Big data gaurav

What is Big Data?

Topic 1 – Understanding Big Data

Structuring & Elements

Application in Business & Careers

Page 12: Big data gaurav

DATA

Personal Computers

Facebook

Twitter

YouTube

Google

ATMs

Dropbox

Picasa

Page 13: Big data gaurav

2002: 5 exabytes of online data

2009: 281 exabytes of online data (a 56-fold increase)

Page 18: Big data gaurav

What is Big Data?

A pool of large-sized datasets to capture, store, search, share, transfer, analyse, and visualise related information or data within an acceptable elapsed time.

Page 20: Big data gaurav

Data = Information

Information = Insight

Page 24: Big data gaurav

• Every second, consumers make 10,000 payment card transactions worldwide

• Every hour, Walmart handles more than 1 million customer transactions

• Every day, Twitter users post 500 million tweets

• Facebook users post 2.7 billion likes and comments in a day
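
The figures above are quoted per second, per hour, and per day, which makes them hard to compare. A minimal sketch that converts the slide's own numbers to a common per-second rate (the variable and function names are illustrative, not from the course material):

```python
# Rates quoted on the slide, expressed as (count, seconds in the stated period).
SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

events = {
    "payment card transactions": (10_000, 1),
    "Walmart customer transactions": (1_000_000, SECONDS_PER_HOUR),
    "tweets": (500_000_000, SECONDS_PER_DAY),
    "Facebook likes and comments": (2_700_000_000, SECONDS_PER_DAY),
}


def per_second(count: int, period_seconds: int) -> float:
    """Average number of events per second over the period."""
    return count / period_seconds


rates = {name: per_second(*spec) for name, spec in events.items()}

# Print the sources from fastest to slowest.
for name, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: about {rate:,.0f} per second")
```

Expressed this way, the rates are averages only; real traffic is bursty, which is part of what the Velocity dimension refers to.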

Page 27: Big data gaurav

BIG DATA

Is a new data challenge that requires leveraging existing systems differently

Is classified in terms of:

• Volume (terabytes, records, transactions)

• Variety (internal, external, behavioural, and/or social)

• Velocity (near or real-time assimilation)

Is usually unstructured and qualitative in nature

Page 32: Big data gaurav

Advantages of Studying Big Data:

• Understanding target customers

• Cutting down expenditure in healthcare

• Increasing operating margins in retail

• Increasing profits through improvements in operational efficiency

Page 36: Big data gaurav

Industries that Benefit:

• Sports

• Science and Research

• Security and Law Enforcement

• Financial Trading

Page 46: Big data gaurav

Departments that can Benefit:

• Procurement

• Product Development

• Manufacturing

• Distribution

• Marketing

• Price Management

• Merchandising

• Sales

• Store Operations

• Human Resources

Page 47: Big data gaurav

Flu Indications & Warnings

Massive Data Collection

Analyse Collected Data

Early Warnings for Flu Plague

Page 48: Big data gaurav

Social Data from Networking Sites reveals Behavioural Patterns

Page 49: Big data gaurav

Use Big Data for Growth & Value Addition

Page 50: Big data gaurav

RECAP

What is Big Data, its advantages and various sources

Page 54: Big data gaurav

Topic 1

Class 1 - Introduction to Big Data

Understanding Big Data

Page 57: Big data gaurav

What is Big Data?

Class 1 - Introduction to Big Data

Structuring & Elements

Application in Business & Careers

Page 59: Big data gaurav

How do I keep myself updated on events and news?

Which news articles should I read?

How do I choose a book, of the millions available on my favorite sites or stores?

How can I use the vast amount of data and information I come across?

Page 64: Big data gaurav

Formats of Data:

Page 65: Big data gaurav

Sources of Data:

Internal – Organisational or enterprise data

External – Social data from the internet or government

Page 66: Big data gaurav

BIG DATA

Structured Data

Unstructured Data

Semi-Structured Data

Page 67: Big data gaurav

Structured Data

Page 71: Big data gaurav

Features of Structured Data:

• Has a predefined format

• Resides in fixed fields within a record

• Has its attributes mapped

• Is used to report against predetermined data types
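
As a rough illustration of these features, the sketch below models a record with a predefined format and fixed, typed fields, then reports against them. The `SaleRecord` schema and the sample rows are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SaleRecord:
    # Hypothetical fixed schema: every record has exactly these typed fields.
    sale_id: int
    customer_id: int
    product: str
    amount: float


def parse_row(row: str) -> SaleRecord:
    """Parse one comma-separated row whose columns follow the schema order."""
    sale_id, customer_id, product, amount = row.split(",")
    return SaleRecord(int(sale_id), int(customer_id), product.strip(), float(amount))


rows = ["1,101,basic tee,9.99", "2,102,leggings,19.50", "3,101,leggings,19.50"]
records = [parse_row(r) for r in rows]

# Because the fields are predefined, a report can be written against them directly.
total_by_product = {}
for rec in records:
    total_by_product[rec.product] = total_by_product.get(rec.product, 0.0) + rec.amount

print([f.name for f in fields(SaleRecord)])
print(total_by_product)
```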

Page 75: Big data gaurav

Sources of Structured Data:

• Relational databases

• Flat files in record format

• Multidimensional databases

• Legacy databases

Page 76: Big data gaurav

Unstructured Data

Page 79: Big data gaurav

Sources of Unstructured Data:

• Organisational Data

• Social Media

• Mobile Data

Page 83: Big data gaurav

Challenges of Using Unstructured Data:

• It is difficult and time-consuming to make sense of

• It is difficult to combine and link unstructured data with more structured information

• It adds cost in terms of wasted storage and the human resources needed

Page 84: Big data gaurav
Page 85: Big data gaurav

Semi-Structured Data

Page 88: Big data gaurav

Sources of Semi-Structured data:

• Database systems

• File systems like Web data and bibliographic data

• Data exchange formats like scientific data

Page 89: Big data gaurav

Sl. No    Name                                   E-mail
1.        Sam Jacobs                             [email protected]
2.        First Name: David, Last Name: Brown    [email protected]
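
The table above is semi-structured because the two entries carry the name in different shapes. A minimal sketch of handling such records in code, assuming a JSON encoding; the field names and the `example.com` addresses are placeholders, since the original addresses are not visible in the transcript:

```python
import json

# Hypothetical records mirroring the slide's table: the second entry splits the
# name into two sub-fields, so the records do not share one fixed schema.
raw = """
[
  {"sl_no": 1, "name": "Sam Jacobs", "email": "sam@example.com"},
  {"sl_no": 2, "name": {"first": "David", "last": "Brown"}, "email": "david@example.com"}
]
"""


def full_name(record: dict) -> str:
    """Return a single display name whichever shape the 'name' field has."""
    name = record["name"]
    if isinstance(name, dict):
        return f"{name['first']} {name['last']}"
    return name


records = json.loads(raw)
names = [full_name(r) for r in records]
print(names)
```

The data carries its own labels (tags or keys), so it can be parsed, but each consumer has to cope with the variation in shape.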

Page 90: Big data gaurav
Page 91: Big data gaurav

Volume

Page 92: Big data gaurav

Velocity

Page 93: Big data gaurav
Page 94: Big data gaurav

Variety

Page 95: Big data gaurav

What is Big Data?

Class 1 - Introduction to Big Data

Structuring & Elements

Application in Business & Careers

Page 96: Big data gaurav

Big Data Application In Business Analytics

Page 97: Big data gaurav
Page 98: Big data gaurav
Page 99: Big data gaurav

What are the areas where Big Data can be applied?

Page 100: Big data gaurav

Transportation

Provides improved traffic information and autonomous features

Page 101: Big data gaurav

Education

Through innovative approaches for teachers to analyze students

Page 102: Big data gaurav

Travel

Apply analytics to pricing, inventory, and advertising to improve customer experiences

Page 103: Big data gaurav

Governments

To make informed decisions for fraud management, discover unknown threats, and ensure the security of the global supply chain

Page 104: Big data gaurav

Healthcare

To establish clinical protocols that ensure the best health outcomes for patients

Page 105: Big data gaurav

Careers in Big Data

Page 106: Big data gaurav

BIG Career Opportunities

Page 107: Big data gaurav

Major Big Data Hiring Companies:

Product companies, e.g., Oracle

Technology drivers, e.g., Google

Services companies, e.g., EMC

Data analytics companies, e.g., Splunk

Page 110: Big data gaurav

The most common job titles in Big Data include:

Big Data Analyst

Big Data Scientist

Big Data Developer

Page 116: Big data gaurav

Module 1: Introduction to Big Data

Big Data Analyst Certification Track

Module 2: Introduction to Analytics & R Programming

Module 3: Data Analysis Using R

Module 4: Advanced Analytics Using R

Module 5: Machine Learning Concepts

Module 6: Social Media, Mobile Analytics & Visualisation

Module 7: Industry Applications of Big Data

Big Data Developer Certification Track

Module 2: Managing a Big Data Ecosystem

Module 3: Storing & Processing Data: HDFS & MapReduce

Module 4: Increasing Efficiency with Hadoop Tools

Module 5: Additional Hadoop Tools: ZooKeeper, Sqoop, Flume, YARN & Storm

Module 6: Leveraging NoSQL & Hadoop: Real Time, Security & Cloud

Module 7: Commercial Hadoop Distribution & Management Tools

Complete Project

Wrox Certified Big Data Analyst / Developer

Page 124: Big data gaurav

Technical Skills Required for a Big Data Analyst:

• Handle & analyse massive data sets using MapReduce

• Hadoop and its components HBase & Hive

• SQL, NoSQL, and Hadoop query tools such as Impala, Hive and Pig

• Analytical tools such as SAS, R, Tableau

• Statistical techniques to implement text analytics solutions

• Data handling and manipulation techniques

• Generate client ready dashboards, reports and visualizations
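
The first skill in the list refers to MapReduce. The sketch below imitates its three steps (map, shuffle, reduce) in plain Python on a word-count task. It is only a single-machine illustration of the programming model, not Hadoop's API, and the function names are chosen for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document: str):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: combine the values for each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


documents = ["big data is big", "data about data"]
mapped = chain.from_iterable(map_phase(doc) for doc in documents)
word_counts = reduce_phase(shuffle(mapped))
print(word_counts)
```

In a real cluster each document (or file block) would be mapped on a different node, and the shuffle would move data across the network before the reducers run.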

Page 127: Big data gaurav

Soft Skills Required:

• Strong written & verbal communication skills

• Analytical Ability

• Basic understanding of how a business works

Page 128: Big data gaurav

Future of Big Data

Page 129: Big data gaurav

RECAP

What are the various types and structures of Big Data and the elements that form it

What are the business applications of Big Data and the career opportunities associated

Page 132: Big data gaurav

BIG DATA

Page 133: Big data gaurav

Topic 2: Business Applications of Big Data

Class 1: Introduction to Big Data

Page 134: Big data gaurav
Page 135: Big data gaurav
Page 136: Big data gaurav

Social Media

Page 140: Big data gaurav

Topic 2: Business Applications of Big Data

Significance of Social Network Data

Financial Fraud & Big Data

Fraud Detection in Insurance

Use in Retail Industry

Page 144: Big data gaurav

Significance of Social Network Data

What is Social Network Data?

What is Social Network Analysis?

What are the uses of Social Network Data Analysis?

What is Sentiment Analysis?

Page 145: Big data gaurav

DATA

Page 146: Big data gaurav
Page 149: Big data gaurav

Social Media

AGE

GENDER

LOCATION

Page 150: Big data gaurav
Page 151: Big data gaurav

Significance of Social Network Data

What is Social Network Data?

What is Social Network Analysis?

What are the uses of Social Network Data Analysis?

What is Sentiment Analysis?

Page 154: Big data gaurav

Social Network Analysis (SNA)

Social Network

DATA

Analysis

Page 155: Big data gaurav
Page 157: Big data gaurav

Total Number of calls

Total Number of SMS

Page 158: Big data gaurav

Structure of a Caller’s Social Network

Page 159: Big data gaurav

Social Network Site

Page 168: Big data gaurav

Social Network Analysis: a Big Data Problem

Page 169: Big data gaurav

Significance of Social Network Data

What is Social Network Data?

What is Social Network Analysis?

What are the uses of Social Network Data Analysis?

What is Sentiment Analysis?

Page 173: Big data gaurav

Social Network Analysis (SNA)

Business Intelligence

Marketing

Product Design & Development

Page 174: Big data gaurav
Page 175: Big data gaurav
Page 176: Big data gaurav

Customer Relationship Management (CRM)

Page 177: Big data gaurav
Page 178: Big data gaurav
Page 179: Big data gaurav
Page 180: Big data gaurav
Page 181: Big data gaurav

[Figure: network diagram of customers (A, B, C, ...) and their connections, clustered into groups]

Page 184: Big data gaurav

Social Network Data Analysis

Provides new contexts in which decisions are data-driven, not opinion-driven

Allows organizations to shift their goals towards maximizing the profitability of a customer's network

Allows organizations to identify highly connected customers

Page 186: Big data gaurav

Social Network Data Analysis

Allows organizations to attract highly connected customers with free trials and solicit their feedback

Allows organizations to encourage internal customers to become more active
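
One simple way to find highly connected customers is to count each customer's distinct contacts (the degree of that node in the network). The sketch below does this for a handful of made-up call records. The customer labels and the choice of degree as the measure are assumptions for illustration, and the slides do not specify a method.

```python
from collections import defaultdict

# Hypothetical call records: (caller, callee). Names are illustrative only.
calls = [
    ("A", "E"), ("A", "F"), ("B", "A"), ("B", "D"),
    ("C", "A"), ("A", "E"), ("D", "A"),
]


def build_network(records):
    """Build an undirected contact graph: customer -> set of distinct contacts."""
    network = defaultdict(set)
    for caller, callee in records:
        network[caller].add(callee)
        network[callee].add(caller)
    return network


def most_connected(network, top_n=3):
    """Rank customers by degree (number of distinct contacts), ties by name."""
    ranked = sorted(network.items(), key=lambda item: (-len(item[1]), item[0]))
    return [(customer, len(contacts)) for customer, contacts in ranked[:top_n]]


network = build_network(calls)
top_customers = most_connected(network)
print(top_customers)
```

Repeated calls between the same pair count once here. Weighting edges by the total number of calls or SMS would be a natural extension.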

Page 187: Big data gaurav

Social Network Analysis (SNA)

Business Intelligence

Marketing

Product Design & Development

Page 188: Big data gaurav
Page 189: Big data gaurav
Page 190: Big data gaurav
Page 191: Big data gaurav
Page 192: Big data gaurav
Page 193: Big data gaurav
Page 194: Big data gaurav
Page 195: Big data gaurav
Page 196: Big data gaurav
Page 197: Big data gaurav
Page 199: Big data gaurav

Social Data

Analysis

Page 200: Big data gaurav

Analyze Media Communication

Page 201: Big data gaurav
Page 202: Big data gaurav

Social Network Analysis (SNA)

Business Intelligence

Marketing

Product Design & Development

Page 203: Big data gaurav
Page 204: Big data gaurav
Page 207: Big data gaurav

DATA

System

Page 208: Big data gaurav

Significance of Social Network Data

What is Social Network Data?

What is Social Network Analysis?

What are the uses of Social Network Data Analysis?

What is Sentiment Analysis?

Page 209: Big data gaurav
Page 210: Big data gaurav

Product Development and Offerings

Page 211: Big data gaurav

Sentiment Analysis

Marketers

Business Professionals

Page 213: Big data gaurav

3,46,259 Followers

2,73,591 Likes

But is one of the most disliked airlines. Why?
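
Follower and like counts say nothing about what people actually write, which is the gap sentiment analysis tries to fill. A deliberately tiny, lexicon-based sketch follows. The word lists and sample posts are invented for illustration, and production systems typically use far richer lexicons or trained models.

```python
import re

# Tiny illustrative word lists; a real system would use a much richer lexicon or model.
POSITIVE = {"great", "love", "friendly", "comfortable"}
NEGATIVE = {"delayed", "lost", "rude", "worst", "cancelled"}


def sentiment_score(text: str) -> int:
    """Count positive words minus negative words in one post."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(score: int) -> str:
    """Turn a numeric score into a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


posts = [
    "Flight delayed again and my bag was lost",
    "Friendly crew, great service",
    "Worst airline, rude staff, cancelled without notice",
    "Following for fare alerts",
]
labels = [label(sentiment_score(p)) for p in posts]
print(labels)
```

The last post shows why a follower is not necessarily a fan: the account is followed for a practical reason and the post is neutral.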

Page 214: Big data gaurav

RECAP

What is social network data and analysis

What are its uses and values

Page 217: Big data gaurav

BIG DATA

Page 218: Big data gaurav

Topic 2: Business Applications of Big Data

Class 1: Introduction to Big Data

Page 219: Big data gaurav

Topic 2: Business Applications of Big Data

Significance of Social Network Data

Financial Fraud & Big Data

Fraud Detection in Insurance

Use in Retail Industry

Page 220: Big data gaurav

BANK

Page 221: Big data gaurav
Page 222: Big data gaurav

Common Financial Frauds

Credit Card Frauds

Exchange or Return Policy Fraud

Personal Information Fraud

Page 223: Big data gaurav

Prevent Frauds

Understand customers' ordering patterns

Watch out for red flags

Page 224: Big data gaurav

Big Data

Fraud

Page 225: Big data gaurav

Analyzing data with a small sample size: cannot understand the various patterns of fraud

Analyzing data with a large sample size: can understand the various patterns of fraud

• Traditionally the sample size could not be increased, because doing so required huge investments of time and money

• Big Data techniques can overcome this challenge

Page 226: Big data gaurav

Big Data analytics can…

Run checks on all data to identify fraudulent transactions

Identify new kinds of fraud and add them to the set of fraud-prevention checks

Avoid impeding customers with unnecessary policies and governance structures

Page 227: Big data gaurav

Fraud Detection in Real Time

BIG DATA

live transactions

sources of data

Page 228: Big data gaurav

BIG DATA

Historical data indicate fraud patterns

Checks to prevent real-time fraud
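
A minimal sketch of that idea: a per-customer limit is derived from historical amounts and then applied to each live transaction. The mean-plus-three-standard-deviations rule, the customer IDs, and the amounts are all assumptions made for this example, and real fraud checks combine many more signals.

```python
from statistics import mean, pstdev

# Hypothetical historical transaction amounts per customer.
history = {
    "cust-1": [20.0, 25.0, 22.0, 30.0, 28.0],
    "cust-2": [200.0, 180.0, 220.0, 210.0, 190.0],
}


def build_profiles(history, k=3.0):
    """Derive a per-customer upper limit: mean + k population standard deviations."""
    return {cust: mean(amounts) + k * pstdev(amounts) for cust, amounts in history.items()}


def check_transaction(profiles, customer, amount):
    """Flag a live transaction if the customer is unknown or the amount exceeds the limit."""
    limit = profiles.get(customer)
    if limit is None:
        return "review: unknown customer"
    return "flag: unusual amount" if amount > limit else "ok"


profiles = build_profiles(history)
print(check_transaction(profiles, "cust-1", 27.0))
print(check_transaction(profiles, "cust-1", 400.0))
```

The expensive step (building profiles) runs over the historical data ahead of time, so the live check is a cheap lookup and comparison.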

Page 229: Big data gaurav

Real-time Analysis

Page 230: Big data gaurav

BIG DATA

Create comparisons

Drawing Maps & Graphs

Decisions and effective systems

BLOCK FRAUD

Page 231: Big data gaurav

Topic 2: Business Applications of Big Data

The Significance of Social Network Data

Financial Fraud and Big Data

Fraud Detection in Insurance

Use of Big Data in the Retail Industry

Page 232: Big data gaurav

Insurance Company

Needs to improve its ability to make decisions in real time when processing a new claim, thereby reducing the claim cycle time

Incurs a steady increase in the cost of litigation and fraudulent claims

Underwriters do not have the required data at the right time to make the necessary decisions, which further delays processing

Page 233: Big data gaurav

BIG DATA

Social Media Data

Note for underwriter

Page 234: Big data gaurav

Social Media Triggers to Identify Fraud

In the claim, a customer might indicate that his or her car was destroyed in a flood.

Documentation from the social media feed shows that the car was actually in another city on the day the flood occurred.

Such glaring discrepancies indicate FRAUD.
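
The flood example can be sketched as a simple cross-check between the claim and dated, located social media posts. The `Claim` and `SocialPost` types, the cities, and the dates are hypothetical, and extracting a reliable location from a post is assumed to have happened already.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Claim:
    claim_id: str
    event_date: date
    event_city: str


@dataclass(frozen=True)
class SocialPost:
    posted_on: date
    city: str  # location attached to the post, assumed already extracted


def find_discrepancies(claim, posts):
    """Return posts made on the event date from a different city than the claim states."""
    return [
        p for p in posts
        if p.posted_on == claim.event_date and p.city.lower() != claim.event_city.lower()
    ]


claim = Claim("CLM-1", date(2014, 7, 10), "Mumbai")
posts = [
    SocialPost(date(2014, 7, 9), "Mumbai"),
    SocialPost(date(2014, 7, 10), "Pune"),
]
triggers = find_discrepancies(claim, posts)
print(f"{len(triggers)} discrepancy trigger(s) for underwriter review")
```

A trigger here is only a prompt for human review, in line with the "note for underwriter" on the earlier slide.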

Page 235: Big data gaurav

Insurance Frauds

Have huge cost implications for the organization

Organizations prefer using Big Data analytics and other advanced technologies to detect them

Detection has a positive impact on customers, as fraud losses are otherwise passed on to them as higher premiums

Page 236: Big data gaurav

Big Data analytics platform

Organizations are now able to analyze complex information and accident scenarios in minutes rather than days or months

INSURANCE

Page 237: Big data gaurav

Fraud Detection Methods

Statistical Models

Typically use small samples of data for analysis

Rely on previously recorded fraud cases

Every time a fraud based on a new technique occurs, insurance companies have to bear the consequences and the losses the first time

The traditional method of identifying fraud works in independent silos

It is not capable of handling various sources of information from different channels and different functions in an integrated way

Page 238: Big data gaurav

Public Data

Bank Statements

Legal Judgments

Criminal Records

Medical Bills

Page 239: Big data gaurav

Social Network Analysis (SNA)

Big Data can be used to create visibility into blind spots for businesses

SNA is an innovative and effective way to identify and detect frauds

Page 240: Big data gaurav
Page 241: Big data gaurav

SNA tool uses a mix of analytical methods

• Statistical methods

• Pattern analysis

• Link analysis

Page 242: Big data gaurav

When link analysis is used in fraud detection

• It looks for clusters of data

• It examines how those data clusters are linked to other data clusters

• Public records are among the various data sources that can be integrated into a model

• The insurer can rate claims
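
One way to picture "clusters of data" and the links between them: treat two claims as linked when they share an attribute value, and group everything reachable through such links. The sketch below uses a small union-find structure. The claim IDs, addresses, and provider codes are invented, and the slides do not prescribe this particular algorithm.

```python
from collections import defaultdict

# Hypothetical claims; 'address' and 'provider' are the attributes used to link them.
claims = {
    "C1": {"address": "12 Elm St", "provider": "P-7"},
    "C2": {"address": "12 Elm St", "provider": "P-3"},
    "C3": {"address": "9 Oak Ave", "provider": "P-3"},
    "C4": {"address": "40 Pine Rd", "provider": "P-9"},
}


def find(parent, x):
    """Follow parent pointers to the cluster representative (with path halving)."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def link_clusters(claims):
    """Group claims that share any attribute value, directly or through a chain."""
    parent = {cid: cid for cid in claims}
    first_seen = {}
    for cid, attrs in claims.items():
        for key, value in attrs.items():
            # The first claim seen with this value becomes the anchor for later ones.
            other = first_seen.setdefault((key, value), cid)
            parent[find(parent, cid)] = find(parent, other)
    clusters = defaultdict(list)
    for cid in claims:
        clusters[find(parent, cid)].append(cid)
    return sorted(sorted(members) for members in clusters.values())


clusters = link_clusters(claims)
print(clusters)
```

C1 and C3 share nothing directly, yet they land in the same cluster through C2, which is the kind of indirect connection that reviewing claims one at a time tends to miss.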

Page 243: Big data gaurav

When link analysis is used in fraud detection

If the rating is high, it indicates that the claim is fraudulent, for example because of:

• a known bad address

• a suspicious provider

• a vehicle that was involved in many accidents with multiple carriers

Page 244: Big data gaurav

How fast does data arrive? 

Page 245: Big data gaurav

How much unneeded data is there when it arrives?

Page 246: Big data gaurav

How deep should the analysis be to produce the most accurate results?

Page 247: Big data gaurav

What type of user interface components need to be included on the SNA dashboard?

Page 248: Big data gaurav

SNA method to detect fraud:

Structured and unstructured data from various sources are fed into the ETL (Extract, Transform, and Load) tool

This data is then transformed and loaded into a data warehouse

The analytics team uses information from various sources, scores the risk of fraud, and ranks the likelihood of fraud

The information used can come from varied sources: prior belief, previous relationships, number of rejected claims, etc.

Big Data technologies such as text mining, sentiment analysis, content categorization, and social network analysis are included in the fraud detection and predictive modeling mechanism.

Page 249: Big data gaurav

SNA method to detect fraud:

Depending on the score of a particular network, an alert is generated

Investigators can leverage this information and begin researching the fraudulent claim further

Fraud issues that are identified are added to the case system.
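
The score-then-alert step can be sketched as follows. The indicator names echo those on the earlier slides, but the weights, the threshold, and the cap on rejected claims are assumptions made up for this example.

```python
# Hypothetical indicator weights; real weights would come from the analytics team.
WEIGHTS = {
    "known_bad_address": 40,
    "suspicious_provider": 30,
    "multiple_carrier_accidents": 20,
    "prior_rejected_claims": 10,  # applied per rejected claim, capped below
}
ALERT_THRESHOLD = 60  # assumed cut-off for raising an alert


def score_claim(claim):
    """Add up the weights of the indicators present on a claim."""
    score = 0
    for flag in ("known_bad_address", "suspicious_provider", "multiple_carrier_accidents"):
        if claim.get(flag):
            score += WEIGHTS[flag]
    score += WEIGHTS["prior_rejected_claims"] * min(claim.get("prior_rejected_claims", 0), 3)
    return score


def triage(claims):
    """Return (claim_id, score) for claims at or above the threshold, highest first."""
    scored = [(c["claim_id"], score_claim(c)) for c in claims]
    alerts = [item for item in scored if item[1] >= ALERT_THRESHOLD]
    return sorted(alerts, key=lambda item: item[1], reverse=True)


claims = [
    {"claim_id": "C1", "known_bad_address": True, "suspicious_provider": True},
    {"claim_id": "C2", "prior_rejected_claims": 1},
    {"claim_id": "C3", "known_bad_address": True, "prior_rejected_claims": 5},
]
case_queue = triage(claims)
print(case_queue)
```

The resulting queue is what investigators would pick up, and confirmed cases would then be recorded in the case system.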

Page 250: Big data gaurav

Predictive analysis works on the principle that the earlier fraud is detected, the smaller the loss incurred by the business.

Page 251: Big data gaurav
Page 252: Big data gaurav

Fraud detection

BIG DATA

Text analytics

Sentiment analysis

Predictive analytics

Page 253: Big data gaurav

Predictive Analytics Technology

Claims adjusters write lengthy reports while investigating a claim. Clues are hidden in these reports that a claims adjuster would not notice

A computing system based on business rules highlights clues of possible fraud

The fraud detection system spots these discrepancies and flags the claim as fraudulent
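
A rough sketch of rule-based clue highlighting in an adjuster's report follows. The three rules and the sample report are invented for illustration. A real system would hold many more rules and would usually combine them with statistical text analytics.

```python
import re

# Hypothetical business rules: a label and a pattern to look for in the report text.
RULES = {
    "no witnesses": re.compile(r"\bno witness(es)?\b", re.IGNORECASE),
    "recent policy": re.compile(
        r"\bpolicy (was )?(purchased|taken out) (last|this) (week|month)\b", re.IGNORECASE
    ),
    "cash only": re.compile(r"\bcash[- ]only\b", re.IGNORECASE),
}


def highlight_clues(report):
    """Return, for each rule that matches, the sentences that triggered it."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", report) if s.strip()]
    hits = {}
    for label, pattern in RULES.items():
        matched = [s for s in sentences if pattern.search(s)]
        if matched:
            hits[label] = matched
    return hits


report = (
    "The insured reported the theft on Monday. There were no witnesses at the scene. "
    "The policy was purchased last month. Repairs were quoted by a cash-only garage."
)
clues = highlight_clues(report)
print(sorted(clues))
```

Each clue is returned with the sentence that triggered it, so an investigator can see the context rather than just a flag.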

Page 254: Big data gaurav

Customer Relationship Management (CRM)

Page 255: Big data gaurav
Page 256: Big data gaurav

The following briefly describes how a Social CRM process works:

The organization's existing CRM is used to gather data from various social media platforms

A "listening" tool extracts data from social chatter, which acts as reference data for the existing data in the organization's CRM

The reference data, along with the information stored in the CRM, is fed into a case management system

The case management system analyzes the information on the basis of the organization's business rules and sends a response

The response from the claim management system on a fraudulent claim is confirmed by investigators

Page 257: Big data gaurav

Class 1: Introduction to Big Data

The Significance of Social Network Data

Financial Fraud and Big Data

Fraud Detection in Insurance

Use of Big Data in Retail Industry

Page 258: Big data gaurav

Use of Big Data in Retail Industry

BIG DATA

MALL

Page 259: Big data gaurav

Use of Big Data in Retail Industry

How many basic tees did we sell today?

What time of the year do we sell most leggings?

What else has customer X bought?

What kind of coupons can we send to customer X?
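
The first three questions map directly onto queries over sales data. A minimal sketch using an in-memory SQLite database follows. The `sales` table, its columns, and the rows are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, customer TEXT, product TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("2014-03-01", "X", "basic tee", 2),
        ("2014-03-01", "Y", "basic tee", 1),
        ("2014-03-01", "X", "leggings", 1),
        ("2014-11-15", "Y", "leggings", 3),
        ("2014-11-20", "Z", "leggings", 2),
    ],
)


def scalar(sql, params=()):
    """Run a query and return the first column of the first row."""
    return conn.execute(sql, params).fetchone()[0]


# How many basic tees did we sell on a given day?
tees_sold = scalar(
    "SELECT SUM(qty) FROM sales WHERE product = 'basic tee' AND sale_date = ?",
    ("2014-03-01",),
)

# In which month do we sell the most leggings?
best_leggings_month = scalar(
    "SELECT strftime('%m', sale_date) AS month FROM sales WHERE product = 'leggings' "
    "GROUP BY month ORDER BY SUM(qty) DESC LIMIT 1"
)

# What else has customer X bought?
bought_by_x = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT product FROM sales WHERE customer = 'X' ORDER BY product"
    )
]
print(tees_sold, best_leggings_month, bought_by_x)
```

The coupon question is different in kind: it calls for a recommendation built on answers like these, not a single lookup.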

Page 260: Big data gaurav

Use of Big Data in Retail Industry

Page 261: Big data gaurav

Use of Big Data in Retail Industry

MALL

In-store Sales Online Sales

Page 262: Big data gaurav

Use of Big Data in Retail Industry

Page 263: Big data gaurav

Use of Big Data in Retail Industry

Page 264: Big data gaurav

Most of the Big Data is just not required and not useful either

• some information will have long-term strategic value

• some will be useful only for immediate and tactical use

• some data won’t be used for anything at all

Page 265: Big data gaurav

Use of RFID (Radio Frequency Identification) Data in Retail

An RFID tag is a small tag that carries a unique code identifying a product, much like a UPC code. The tag is placed on shipping pallets or product packages, as shown in the adjacent image.

Page 266: Big data gaurav

Compared with a bar code, an RFID tag additionally:

Identifies a pallet as allotted to a precise and exclusive set of computer systems

Helps find situations where an item has no units left in store

Specifies the number of units of each item remaining in store, and thereby raises an alarm when restocking is required

Enables better tracking by differentiating products that are out of stock from products that are available on the shelf
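
The stock-tracking points above can be illustrated with a short sketch. It assumes RFID reads arrive as hypothetical (tag_id, item) pairs and that a restocking alarm is raised at an arbitrary threshold of two units; neither detail comes from the slides.

```python
from collections import Counter


def count_units(tag_reads):
    """Count units per item from RFID reads given as (tag_id, item) pairs.

    Each tag id is unique, so a tag that is read twice is counted once.
    """
    seen = set()
    counts = Counter()
    for tag_id, item in tag_reads:
        if tag_id not in seen:
            seen.add(tag_id)
            counts[item] += 1
    return counts


def restock_alerts(counts, catalogue, threshold=2):
    """Split catalogue items into out-of-stock and low-stock lists."""
    out_of_stock = sorted(i for i in catalogue if counts[i] == 0)
    low_stock = sorted(i for i in catalogue if 0 < counts[i] <= threshold)
    return out_of_stock, low_stock
```

Because every tag carries a unique code, the same reader data answers both "which items are gone" and "which items are running low", which a plain bar code scan at checkout cannot do on its own.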

Page 267: Big data gaurav

Use of RFID Data in Retail

• saves time

• reduces labor

• enhances the visibility of products throughout the production-delivery life cycle

• saves costs

Page 268: Big data gaurav

RECAP

The significance of Social Network Data, Financial Fraud, Fraud Detection in Insurance and the uses of Big Data in the Retail Industry

The uses of Big Data in the retail industry, RFID data and its advantages

Page 269: Big data gaurav

BUMPER

Page 270: Big data gaurav

BUMPER

Page 271: Big data gaurav
Page 272: Big data gaurav

Topic 3

Class 1 - Introduction to Big Data

Technologies for Handling Big Data

Page 273: Big data gaurav

Distribution & Computing for Big Data

Topic 3 – Technologies for Handling Big Data

Introducing Hadoop

Cloud Computing & In-Memory Technologies for Big Data

Page 274: Big data gaurav

DATA PROCESSING

Analysed

Page 275: Big data gaurav

Distributed & Parallel Computing

BIG DATA

HADOOP

CLOUD

In-Memory Computing

Page 276: Big data gaurav
Page 277: Big data gaurav
Page 278: Big data gaurav
Page 279: Big data gaurav
Page 283: Big data gaurav

Transmitter

Receiver

Hello?

I can’t hear you…

Page 287: Big data gaurav

Issues caused by Latency:

• Slowdown in system performance

• Data management

• Internal organisational communication

• External communication

Page 290: Big data gaurav

Distributed and parallel processing techniques process large amounts of data and also deal with latency.

Page 293: Big data gaurav

Distributed System

A collection of independent computer systems that are connected via a network to accomplish a specific task.

Page 294: Big data gaurav

Parallel System

A computer system that has multiple processing units attached to it.
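
The idea of spreading one task over several processing units can be sketched in a few lines. This is an illustration of the split-and-combine pattern only: the helper names are made up, and it uses threads from Python's standard library, which in CPython do not speed up CPU-bound work. `ProcessPoolExecutor` offers the same interface when real multi-core execution is needed.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Split a list into `parts` nearly equal, contiguous chunks."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_sum(data, workers=4):
    """Sum each chunk on its own worker, then combine the partial sums."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum, split(data, workers)))
    return sum(partial_sums)
```

A distributed system follows the same pattern, except that the chunks travel over a network to independent machines instead of to workers inside one computer.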

Page 295: Big data gaurav

Parallel Computing Techniques

Clusters or Grids

Page 296: Big data gaurav

Parallel Computing Techniques

Massively Parallel Processing (MPP)

Page 297: Big data gaurav

Parallel Computing Techniques

High-Performance Computing (HPC)

Page 298: Big data gaurav

Public Cloud vs Private Cloud

Page 302: Big data gaurav

Distribution & Computing for Big Data

Topic 3 – Technologies for Handling Big Data

Introducing Hadoop

Cloud Computing & In-Memory Technologies for Big Data

Page 303: Big data gaurav
Page 308: Big data gaurav

Features of Hadoop:

• Works on multiple machines without sharing memory

• Distributes data over different servers

• Can track data stored on different servers

• Runs all available servers in parallel

• Keeps multiple copies of data
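
Three of these features (distributing data over servers, tracking where it is stored, and keeping multiple copies) can be illustrated with a toy placement table. This is a simplified sketch, not how Hadoop actually chooses servers: the round-robin rule and the function names are assumptions, and real HDFS placement also takes rack layout into account.

```python
def place_blocks(num_blocks, servers, copies=3):
    """Assign each block to `copies` distinct servers, round-robin.

    Returns a dict mapping block index -> list of server names. The dict
    plays the role of the bookkeeping that tracks where data is stored.
    """
    if copies > len(servers):
        raise ValueError("need at least as many servers as copies")
    return {
        block: [servers[(block + k) % len(servers)] for k in range(copies)]
        for block in range(num_blocks)
    }


def readable_blocks(placement, failed):
    """Return the blocks that still have at least one copy on a live server."""
    return [
        block
        for block, hosts in placement.items()
        if any(host not in failed for host in hosts)
    ]
```

With three copies per block, every block survives the loss of any two servers, which is the practical benefit of keeping multiple copies.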

Page 309: Big data gaurav
Page 310: Big data gaurav
Page 315: Big data gaurav

Hadoop Cluster

Gateway Node

Switch

Server 1 Server 2 Server 3 Server 4 Server 5

Page 316: Big data gaurav

MapReduce

Page 317: Big data gaurav
Page 321: Big data gaurav

How does Hadoop work?

• An organisation’s data is loaded into the Hadoop software

• The data is divided into pieces, which are sent to different servers

• Hadoop keeps track of where each piece is stored and, when a job is submitted, sends the job code to every server that stores a relevant piece of data

• Each server applies the job code to the portion of data stored on it and returns its results

Page 322: Big data gaurav

Indexing Job

Hadoop Software

Server 1 Server 2 Server 3

Job Code 1 +Processing Data

Job Code 2 +Processing Data

Job Code 3 +Processing Data

Result
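
The flow in the diagram (the same job code sent to each server, each server processing its own piece, results combined) can be simulated on a single machine. This is a sketch under assumptions: the function names are invented, the "servers" are just dictionary keys, and a word count stands in for the indexing job.

```python
from collections import Counter


def load(records, servers):
    """Divide records into pieces, one piece per server (round-robin)."""
    pieces = {server: [] for server in servers}
    for i, record in enumerate(records):
        pieces[servers[i % len(servers)]].append(record)
    return pieces


def run_job(pieces, job, combine):
    """Apply the same job code to every server's piece, then combine results."""
    partial_results = [job(piece) for piece in pieces.values()]
    return combine(partial_results)


def count_words(piece):
    """Job code: count word occurrences in one server's piece of the data."""
    return Counter(word for line in piece for word in line.split())


def merge_counts(partial_results):
    """Combine step: add the per-server counts into one result."""
    return sum(partial_results, Counter())
```

In a real cluster the loop inside `run_job` runs on all servers at the same time, and only the small partial results travel over the network.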

Page 324: Big data gaurav

EXAMPLE:

user_id user_name city_name service_provider_name and call_time

Page 325: Big data gaurav

user_id user_name city_name service_provider_name and call_time
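
A record with these five fields lends itself to a map-and-reduce style calculation. The slides do not say what the example computes, so the following is only one plausible sketch: it assumes comma-separated lines, assumes `call_time` is a whole number of seconds, and totals call time per service provider.

```python
from collections import defaultdict

FIELDS = ["user_id", "user_name", "city_name", "service_provider_name", "call_time"]


def map_record(line):
    """Map step: emit (service_provider_name, call_time) for one CSV line."""
    record = dict(zip(FIELDS, line.split(",")))
    # Assumption: call_time is an integer number of seconds.
    return record["service_provider_name"], int(record["call_time"])


def reduce_by_key(pairs):
    """Reduce step: total the values for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)
```

Swapping `service_provider_name` for `city_name` in the map step would give total call time per city without touching the reduce step.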

Page 326: Big data gaurav

RECAP

Various aspects of distribution and computing for Big Data

Hadoop as a technology for handling Big Data

Page 327: Big data gaurav

BUMPER

Page 328: Big data gaurav

BUMPER

Page 329: Big data gaurav
Page 330: Big data gaurav

Topic 3

Class 1 - Introduction to Big Data

Technologies for Handling Big Data

Page 331: Big data gaurav

Distribution & Computing for Big Data

Topic 3 – Technologies for Handling Big Data

Introducing Hadoop

Cloud Computing & In-Memory Technologies for Big Data

Page 332: Big data gaurav
Page 333: Big data gaurav
Page 334: Big data gaurav
Page 340: Big data gaurav

Features of Cloud Computing:

• Scalability

• Elasticity

• Resource Pooling

• Self Service

• Low Costs

• Fault Tolerance

Page 341: Big data gaurav

What are Cloud Deployment Models?

Page 342: Big data gaurav

PRIVATE CLOUD

Page 343: Big data gaurav
Page 344: Big data gaurav

Categories of Cloud Services:

Page 345: Big data gaurav
Page 346: Big data gaurav
Page 347: Big data gaurav
Page 352: Big data gaurav

Other Amazon Web Services:

• Amazon Elastic MapReduce

• Amazon DynamoDB

• Amazon S3

• Amazon High-Performance Computing

• Amazon Redshift

Page 355: Big data gaurav

Google Web Services:

• Google Compute Engine

• Google BigQuery

• Google Prediction API

Page 356: Big data gaurav

Windows Azure

Page 360: Big data gaurav

In-memory technology makes it possible for departments or business units to take the part of the organizational data that is relevant to their needs and process it locally.
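
As a small-scale illustration of this idea, the sketch below copies one department's slice of a data set into an in-memory SQLite database and aggregates it there. The sample data, table layout, and function names are assumptions made for the example; production in-memory platforms work on the same principle at far larger scale.

```python
import sqlite3

# Hypothetical organisation-wide records: (department, region, amount).
ORG_DATA = [
    ("sales", "north", 1200.0),
    ("sales", "south", 800.0),
    ("hr", "north", 150.0),
    ("sales", "north", 300.0),
]


def load_department(rows, department):
    """Copy only one department's rows into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO records VALUES (?, ?)",
        [(region, amount) for dept, region, amount in rows if dept == department],
    )
    return conn


def totals_by_region(conn):
    """Aggregate the local copy entirely in memory."""
    query = "SELECT region, SUM(amount) FROM records GROUP BY region ORDER BY region"
    return conn.execute(query).fetchall()
```

Once the relevant slice is in memory, repeated queries no longer wait on disk or on the central system, which is where the speed advantage comes from.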

Page 361: Big data gaurav

RECAP

In this session we discussed cloud computing & various in-memory technologies for handling Big Data.

Page 362: Big data gaurav

BUMPER