Big data application using hadoop in cloud [Smart Refrigerator]
Presentation on….
Smart Refrigerator Concept with Cloud and Big Data technique
By: Pushkar Bhandari
Motivation
Sensors are used in many places around the world, but in India sensor-data approaches have not yet matured to the level where they can be used to detect malfunctions.
Sensor applications can be deployed at very large scale, including HCO, chemical factories, household appliances, manufacturing departments, etc.
For safety purposes, predicting a disaster of any type at the right time can help prevent future loss.
Introduction
The aim of the proposed work is to build a model comprising different objects embedded with sensors. These sensors generate data that can be used for analytics or for the development of an organization.
Proposed work
Fig. 1: Proposed architecture (diagram). Recoverable labels:
• Refrigerator [sensors deployed], producing the sensed data
• Mediator / service provider: Griblink
• Private cloud with virtual machines VM1 to VM6, providing the service
• Third-party vendors with registered accounts: 1. Daily ice-cream, 2. Dairy product co., 3. ...
• Vendors access the data through either 1. a discount offer or 2. payment
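Flowchart of the proposed work (diagram). Recoverable steps:
• User and refrigerator: is Wi-Fi available, and should the refrigerator connect to the internet? (Yes/No)
• A registered user creates an account on the cloud.
• Sensors generate data and store it on the cloud.
• Analytics: if no malfunction is detected, normal execution continues; if a malfunction is detected, it is analysed and a notice is generated and sent to the user.
• Third-party vendors with registered accounts make users an offer to access the data.
• User accepted? If yes, the data owner is the third-party vendor, who performs the data analysis.
• Remaining labels: connectors A and B, "Service", "Accepted ?", and "locate".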
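The analytics branch of the flowchart (store the reading, check for a malfunction, then either continue normally or notify the user) can be sketched as below. This is only an illustration: the `Reading` fields, the temperature range, and the door-open limit are hypothetical values chosen for the example, not figures from the presentation.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical safe temperature range for the compartment, in degrees Celsius.
SAFE_TEMP_RANGE_C = (0.0, 8.0)
# Hypothetical limit on how long the door may stay open, in seconds.
MAX_DOOR_OPEN_S = 120


@dataclass
class Reading:
    fridge_id: str
    temperature_c: float
    door_open_s: int  # seconds the door has been continuously open


def detect_malfunction(reading: Reading) -> List[str]:
    """Return the problems found in one reading (empty list means normal)."""
    problems = []
    low, high = SAFE_TEMP_RANGE_C
    if not low <= reading.temperature_c <= high:
        problems.append(f"temperature {reading.temperature_c} C outside {low}-{high} C")
    if reading.door_open_s > MAX_DOOR_OPEN_S:
        problems.append(f"door open for {reading.door_open_s} s")
    return problems


def process(reading: Reading, cloud_store: List[Reading]) -> Optional[str]:
    """Store the reading, run the check, and return a notice or None."""
    cloud_store.append(reading)             # sensors storing data on the cloud
    problems = detect_malfunction(reading)  # analytics step
    if not problems:
        return None                         # normal execution
    return f"Notice for {reading.fridge_id}: " + "; ".join(problems)
```

In a hotel with many refrigerators, the same `process` function would run per reading, and the `fridge_id` in the notice identifies which unit needs attention.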
Background
What is Sensing as a service
A sensor is a transducer whose purpose is to sense some characteristic of its environs.
Using sensors to generate data about different characteristics, and storing that data for analytics and services, is called Sensing as a Service.
Cloud Computing
What is cloud computing?
Cloud computing refers to manipulating, configuring, and accessing applications online. It offers online data storage, infrastructure, and applications.
Cloud computing provides a means of accessing applications as utilities over the internet. It allows us to create, configure, and customize business applications online.
What is Cloud?
The term Cloud refers to a network or the internet. In other words, a cloud is something present at a remote location. A cloud can provide services over public networks or private networks, i.e., WAN, LAN, or VPN.
We do not need to install software on our local PC, which is how cloud computing overcomes platform dependency issues. Hence, cloud computing makes business applications mobile and collaborative.
Types of clouds
Public cloud: The public cloud allows systems and services to be easily accessible to the general public. It may be less secure because of its openness, e.g., e-mail.
Private cloud: The private cloud allows systems and services to be accessible within an organization. It offers increased security because of its private nature.
Community cloud: The community cloud allows systems and services to be accessible by a group of organizations.
Hybrid cloud: The hybrid cloud is a mixture of public and private cloud. Critical activities are performed using the private cloud, while non-critical activities are performed using the public cloud.
Service models
Infrastructure as a Service (IaaS): IaaS provides access to fundamental resources such as physical machines, virtual machines, virtual storage, etc.
Platform as a Service (PaaS): PaaS provides the runtime environment for applications, development and deployment tools, etc.
Software as a Service (SaaS): The SaaS model offers software applications as a service to end users.
In addition to these main layers, other layers have been introduced, such as Database as a Service (DBaaS), Data as a Service (DaaS), Ethernet as a Service (EaaS), Network as a Service (NaaS), Identity and Policy Management as a Service (IPMaaS), and Sensing as a Service. In general, all these models are called XaaS, where 'X' can be virtually anything. This presentation discusses only the Sensing as a Service model.
Emerging Big Data
• Employees generating data
• Users generating data
• Machines generating data
In the good old days, data was brought to the processors to be processed. Today, as data volumes have grown, the processors are brought to the data.
• We constantly produce a lot of data.
• For example, via social media, public transport, and GPS.
• But it goes way beyond that.
• Daily we upload:
55 million pictures, 340 million tweets, 1 billion documents
A total of 2.5 quintillion bytes of data per day
Big Data Defined: 3 V's
• Volume
• Velocity
• Variety
Others
• Value
• Veracity
Traditional Approach
Big data is processed by a powerful computer.
Enterprise approach
Big data is sent to a powerful computer, but only so much data can be processed: there is a processing limit.
To Process Big Data
Breaking the data: big data is broken into pieces.
Hadoop's approach
Move computation to the data: big data is broken into pieces, a computation runs on each piece, and the results are merged into a combined result.
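The idea of running a computation on each piece and then combining the results can be illustrated with a small single-machine sketch. It is not Hadoop code: the function names, the piece size, and the temperature readings are hypothetical, and the pieces are processed in a plain loop rather than on separate nodes.

```python
from typing import List, Sequence, Tuple


def split_into_pieces(data: Sequence[float], piece_size: int) -> List[Sequence[float]]:
    """Break the data set into consecutive pieces of at most piece_size items."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def compute_on_piece(piece: Sequence[float]) -> Tuple[float, int]:
    """Runs where the piece lives and returns a small summary: (sum, count)."""
    return sum(piece), len(piece)


def combine(partials: List[Tuple[float, int]]) -> float:
    """Merge the per-piece summaries into the overall mean."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


# Hypothetical refrigerator temperature readings in degrees Celsius.
readings = [3.5, 4.0, 4.5, 5.0, 9.0, 4.0, 3.0]
pieces = split_into_pieces(readings, piece_size=3)
partials = [compute_on_piece(p) for p in pieces]
mean_temp = combine(partials)
```

Each piece returns a (sum, count) pair rather than its own mean, because the pieces can differ in size and averaging the per-piece means would then give the wrong overall result.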
What is Hadoop
Hadoop is a framework of tools.
Objective: Hadoop supports running applications on big data.
Hadoop Origins..
• Hadoop is an open-source implementation based on GFS and MapReduce from Google.
• Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung (2003): The Google File System.
• Jeffrey Dean and Sanjay Ghemawat (2004): MapReduce: Simplified Data Processing on Large Clusters.
• Created by Doug Cutting and Michael Cafarella in 2005; Cutting joined Yahoo in 2006.
• Hadoop became an Apache project in 2006, with Yahoo as a major early contributor.
Architecture
• MapReduce
• HDFS
• Projects: provide additional functionality
HDFS is…
• A distributed file system
• Basically the storage layer
• Designed to reliably store data using commodity hardware
• Designed to overcome hardware failures
• Intended for large files
• Designed for batch inserts
HDFS – files and blocks
• Files are stored as a collection of blocks.
• Blocks are 64 MB chunks of a file by default in Hadoop 1.x [configurable].
• Each block is replicated on 3 nodes by default.
• The NameNode (NN) manages metadata about files and blocks.
• The SecondaryNameNode (SNN) periodically checkpoints the NN metadata; it is not a hot standby for the NN.
• DataNodes (DN) store and serve blocks.
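As a worked example of the block and replication figures above, the sketch below estimates how many blocks a file occupies and how much raw storage it uses. The function name `hdfs_footprint` and the 1000 MB file size are hypothetical; the 64 MB block size and replication factor of 3 are the values from the slide.

```python
import math

BLOCK_SIZE_MB = 64  # block size from the slide (configurable in HDFS)
REPLICATION = 3     # replication factor from the slide


def hdfs_footprint(file_size_mb, block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Return (number of blocks, block replicas stored, raw storage in MB)."""
    blocks = math.ceil(file_size_mb / block_size_mb)
    replicas = blocks * replication
    # The last block only occupies as much disk space as the data it holds,
    # so raw storage is the file size times the replication factor.
    raw_mb = file_size_mb * replication
    return blocks, replicas, raw_mb


blocks, replicas, raw_mb = hdfs_footprint(1000)
print(blocks, replicas, raw_mb)  # 16 48 3000
```

A 1000 MB file therefore needs 16 blocks (15 full blocks plus one partial block), stored as 48 block replicas across the DataNodes.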
Hadoop's Stack
Fig. 17: Hadoop's stack (diagram). Recoverable layers:
• MapReduce [distributed programming model or computation framework]
• HDFS [Hadoop Distributed File System]
• HBase, Pig, Hive, Cascading, Oozie, Sqoop, Mahout
Distributed model and Linux based
• Runs on a set of low-cost computers
• Works on Linux-based machines
Task trackers, Data nodes, Job tracker
Cluster diagram. Recoverable labels:
• Master: JobTracker and NameNode, together with a TaskTracker and a DataNode
• Slaves: five nodes, each running a DataNode and a TaskTracker
• Applications enter a queue for batch processing
Fault tolerance
Cluster diagram with the same master and slave layout (JobTracker and NameNode on the master; a DataNode and a TaskTracker on each slave), plus one added label:
• Tables are backed up
Easy Programming
Programmers do not have to worry about:
• Where the file is located
• How to manage failures
• How to break computation into pieces
• How to program for scaling
Programmers could focus on:
• Writing scale-free programs
Scalability cost
Graph: processing speed increases linearly with the number of computers.
Why Hadoop in Cloud
1. Lowering the cost of innovation
2. Procuring large-scale resources quickly
3. Handling batch workloads efficiently
4. Handling variable resource requirements
5. Running closer to the data
6. Simplifying Hadoop operations
Amazon Elastic Compute Cloud
• Amazon Elastic Compute Cloud (EC2) is a central part of Amazon.com's cloud computing platform, Amazon Web Services (AWS).
• EC2 allows users to rent virtual computers on which to run their own computer applications.
• EC2 allows scalable deployment of applications by providing a Web service through which a user can boot an Amazon Machine Image to create a virtual machine, which Amazon calls an "instance", containing any software desired.
• A user can create, launch, and terminate server instances as needed, paying by the hour for active servers, hence the term "elastic".
Instance purchasing options
As of December 2012, the following purchasing options were offered:
On-demand: pay by the hour without commitment.
Reserved: rent instances with a one-time payment in return for a discount on the hourly charge. Reserved Instances can be purchased in three different ways: All Upfront, Partial Upfront, and No Upfront.
Spot: bid-based service that runs jobs only if the spot price is below the bid specified by the bidder. The spot price is claimed to be based on supply and demand; however, some research disputes this claim.
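The spot rule described above (a job runs only while the spot price is below the bid) can be illustrated with a short sketch. The hourly prices and the bid are hypothetical numbers, and the cost function assumes each running hour is billed at that hour's spot price rather than at the bid.

```python
def hours_run(spot_prices, bid):
    """Return the indices of the hours in which the job runs (spot price below bid)."""
    return [hour for hour, price in enumerate(spot_prices) if price < bid]


def spot_cost(spot_prices, bid):
    """Total paid, assuming each running hour is billed at that hour's spot price."""
    return sum(spot_prices[hour] for hour in hours_run(spot_prices, bid))


# Hypothetical hourly spot prices and bid, in dollars per instance-hour.
prices = [0.03, 0.05, 0.02, 0.08, 0.04]
bid = 0.05

running = hours_run(prices, bid)  # hours 0, 2 and 4; hour 1 equals the bid, so it does not run
cost = spot_cost(prices, bid)
```

Because the job is interrupted whenever the price reaches the bid, spot capacity suits batch workloads that can tolerate being paused and resumed.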
For more info : http://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud http://aws.amazon.com/ec2/
Features
• Operating systems
• Persistent storage
• Elastic IP addresses
• Automated scaling
• Reliability
• Multiple Locations
• Amazon Virtual Private Cloud
• Elastic Load Balancing
• High Performance Computing (HPC) Clusters
• High I/O Instances
• High Storage Instances
• VM Import/Export
Benefits
• Elastic Web-Scale Computing
• Completely Controlled
• Flexible Cloud Hosting Services
• Designed for use with other Amazon Web Services
• Reliable
• Secure [For more information on Amazon EC2 security, refer to the Amazon Web Services: Overview of Security Process document.]
• Inexpensive
• On-Demand Instances
Amazon EC2 Functionality
To use Amazon EC2, you simply:
• Select a pre-configured, template Amazon Machine Image (AMI) to get up and running immediately. Or create an AMI containing your applications, libraries, data, and associated configuration settings.
• Configure security and network access on your Amazon EC2 instance.
• Choose which instance type(s) you want, then start, terminate, and monitor as many instances of your AMI as needed, using the web service APIs or the variety of management tools provided.
• Determine whether you want to run in multiple locations, utilize static IP endpoints, or attach persistent block storage to your instances.
• Pay only for the resources that you actually consume, like instance-hours or data transfer.
Advantages of proposed system
• Malfunctions can be detected and handled.
• Door sensors can detect the status of the refrigerator door.
• Sensor data can be used by third-party vendors for analytics.
• In large establishments such as hotels, where many refrigerators are used, a malfunction in any refrigerator can be easily identified.
• Users get the service efficiently.
Conclusion
• In this presentation, we introduced the concept of Sensing as a Service in refrigerators and identified unique challenges of developing a Sensing as a Service cloud, which include: 1) support for various sensing applications; 2) Hadoop in the cloud; 3) detection of malfunction; 4) use of sensor data for analytics by third-party vendors.
• Sensor data can be used at different levels to detect malfunctions in any system.
• The traditional approach to data analysis is changing because sensor-generated data is now used at the community level.