Introduction to Apache Geode (Cork, Ireland)
![Page 1: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/1.jpg)
Introduction to Apache Geode (incubating)
Cork, September 2015
Anthony Baker (@metatype) William Markito (@william_markito)
![Page 2: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/2.jpg)
Agenda
• Introduction to Geode
• Geode concepts and usage
• The Geode open source project
• StockPrediction demo
![Page 3: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/3.jpg)
Introduction
![Page 4: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/4.jpg)
"…an in-memory, distributed database with strong consistency built to support low latency transactional applications at extreme scale."
![Page 5: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/5.jpg)
History
2004 2008 2014
• Massive increase in data volumes
• Falling margins per transaction
• Increasing cost of IT maintenance
• Need for elasticity in systems
• Financial Services Providers (every major Wall Street bank)
• Department of Defense
• Real Time response needs
• Time to market constraints
• Need for flexible data models across enterprise
• Distributed development
• Persistence + In-memory
• Global data visibility needs
• Fast Ingest needs for data
• Need to allow devices to hook into enterprise data
• Always on
• Largest travel Portal
• Airlines
• Trade clearing
• Online gambling
• Largest Telcos
• Large manufacturers
• Largest Payroll processor
• Auto insurance giants
• Largest rail systems on earth
• 1000+ customers in production • Cutting edge use cases
![Page 6: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/6.jpg)
Interesting use cases
China Railway Corporation
• 5,700 train stations
• 4.5 million tickets per day
• 20 million daily users
• 1.4 billion page views per day
• 40,000 visits per second
Indian Railways
• 7,000 stations
• 72,000 miles of track
• 23 million passengers daily
• 120,000 concurrent users
• 10,000 transactions per minute
* http://pivotal.io/big-data/pivotal-gemfire
![Page 7: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/7.jpg)
Interesting use cases
Indian Railways | Population: 1,251,695,616
China Railway Corporation | Population: 1,401,586,609
World: ~7,349,000,000
Together: ~36% of the world population
![Page 8: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/8.jpg)
Application Patterns
• Caching for speed and scale ➡ Read-Through, Write-Through, Write-Behind
• As the OLTP system of record ➡ Data in-memory for low-latency, on disk for durability
• Parallel compute engine
• Real-time analytics
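To make the caching pattern names concrete, here is a minimal plain-Java sketch of read-through and write-through. It does not use the Geode API; the class name `ReadWriteThroughSketch` and the in-memory `backingStore` map (standing in for a database) are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

public class ReadWriteThroughSketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> backingStore; // stands in for a database (assumption)
    private int loads = 0;                          // counts trips to the backing store

    public ReadWriteThroughSketch(Map<String, String> backingStore) {
        this.backingStore = backingStore;
    }

    /** Read-through: on a cache miss, load from the backing store and keep the value. */
    public String get(String key) {
        String value = cache.get(key);
        if (value == null) {
            loads++;
            value = backingStore.get(key);
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }

    /** Write-through: update the backing store and the cache before returning. */
    public void put(String key, String value) {
        backingStore.put(key, value);
        cache.put(key, value);
    }

    public int loads() {
        return loads;
    }

    public static void main(String[] args) {
        Map<String, String> database = new HashMap<>();
        database.put("K01", "May");
        ReadWriteThroughSketch cache = new ReadWriteThroughSketch(database);
        System.out.println(cache.get("K01")); // first read goes to the backing store
        System.out.println(cache.get("K01")); // second read is served from the cache
        cache.put("K02", "Tim");
        System.out.println(database.get("K02") + ", loads=" + cache.loads());
    }
}
```

Write-behind differs from write-through in that the backing-store update is queued and applied later, which trades durability lag for lower write latency.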
![Page 9: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/9.jpg)
Concepts
![Page 10: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/10.jpg)
Geode Concepts and Usage
• Cache
• Region
• Member
• Client Cache
• Functions
• Listeners
![Page 11: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/11.jpg)
Concepts
• Cache
• In-memory storage and management for your data
• Configurable through XML, Spring, Java API or CLI
• A collection of Regions
[Diagram: a JVM containing a Cache, which contains several Regions]
![Page 12: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/12.jpg)
Concepts
• Region
• Distributed java.util.Map on steroids (Key/Value)
• Consistent API regardless of where or how data is stored
• Observable (reactive)
• Highly available, with redundant copies on cache members
• Querying
[Diagram: a JVM containing a Cache, which contains a Region shown as a java.util.Map with Key/Value entries K01 = May and K02 = Tim]
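The "map on steroids" and "observable" points can be sketched in plain Java. This is not the Geode API: `SketchRegion` and `EntryListener` are hypothetical names, and the sketch only shows a single-JVM map that notifies listeners after each put, using the K01/K02 entries from the slide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class ObservableRegionSketch {
    /** Hypothetical callback, loosely modeled on the idea of a cache listener. */
    public interface EntryListener<K, V> {
        void afterPut(K key, V oldValue, V newValue);
    }

    /** Hypothetical stand-in for a region: a concurrent map that notifies listeners. */
    public static class SketchRegion<K, V> {
        private final Map<K, V> entries = new ConcurrentHashMap<>();
        private final List<EntryListener<K, V>> listeners = new CopyOnWriteArrayList<>();

        public void addListener(EntryListener<K, V> listener) {
            listeners.add(listener);
        }

        /** Same contract as Map.put: stores the value and returns the previous one. */
        public V put(K key, V value) {
            V old = entries.put(key, value);
            for (EntryListener<K, V> listener : listeners) {
                listener.afterPut(key, old, value);
            }
            return old;
        }

        public V get(K key) {
            return entries.get(key);
        }
    }

    /** Puts the slide's sample entries, then updates one, and returns the observed events. */
    public static List<String> demo() {
        List<String> events = new ArrayList<>();
        SketchRegion<String, String> people = new SketchRegion<>();
        people.addListener((key, oldValue, newValue) ->
                events.add(key + ": " + oldValue + " -> " + newValue));
        people.put("K01", "May");
        people.put("K02", "Tim");
        people.put("K01", "Mary");
        return events;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In Geode itself the Region interface extends java.util.concurrent.ConcurrentMap, so code written against the map interface carries over; distribution, redundancy, and querying are what the sketch leaves out.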
![Page 13: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/13.jpg)
Concepts
• Region
• Local, Replicated or Partitioned
• In-memory or persistent
• Redundant
• LRU
• Overflow
[Diagram: two JVMs, each with a Cache containing a Region shown as a java.util.Map with entries K01 = May and K02 = Tim]
Region types:
LOCAL, LOCAL_HEAP_LRU, LOCAL_OVERFLOW, LOCAL_PERSISTENT, LOCAL_PERSISTENT_OVERFLOW
PARTITION, PARTITION_HEAP_LRU, PARTITION_OVERFLOW, PARTITION_PERSISTENT, PARTITION_PERSISTENT_OVERFLOW, PARTITION_PROXY, PARTITION_PROXY_REDUNDANT, PARTITION_REDUNDANT, PARTITION_REDUNDANT_HEAP_LRU, PARTITION_REDUNDANT_OVERFLOW, PARTITION_REDUNDANT_PERSISTENT, PARTITION_REDUNDANT_PERSISTENT_OVERFLOW
REPLICATE, REPLICATE_HEAP_LRU, REPLICATE_OVERFLOW, REPLICATE_PERSISTENT, REPLICATE_PERSISTENT_OVERFLOW, REPLICATE_PROXY
![Page 14: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/14.jpg)
Concepts
• Persistent Regions
• Durability
• WAL for efficient writing
• Consistent recovery
• Compaction
[Diagram: on Member 1, a Put k4->v7 is appended to the current oplog (Oplog3.crf) as "Modify k4->v7". The older Oplog2.crf holds earlier records: Modify k1->v5, Create k6->v6, Create k2->v2, Create k4->v4. Servers 1 to N each host the Region in their Cache.]
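The write-ahead-log idea behind persistent regions can be sketched as an append-only list of operations. This is a plain-Java illustration, not Geode's on-disk oplog format; `OplogSketch` and its methods are hypothetical names. It shows why appends are cheap, how recovery replays the log, and what compaction discards.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OplogSketch {
    /** One logged operation: the key and the value written for it. */
    private static final class Op {
        final String key;
        final String value;

        Op(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Append-only: every create or modify is added at the end, like a sequential disk write.
    private final List<Op> log = new ArrayList<>();

    public void append(String key, String value) {
        log.add(new Op(key, value));
    }

    /** Recovery: replay the log in order, so the last write for each key wins. */
    public Map<String, String> recover() {
        Map<String, String> state = new LinkedHashMap<>();
        for (Op op : log) {
            state.put(op.key, op.value);
        }
        return state;
    }

    /** Compaction: rewrite the log so it keeps only the latest record per key. */
    public void compact() {
        Map<String, String> live = recover();
        log.clear();
        for (Map.Entry<String, String> entry : live.entrySet()) {
            log.add(new Op(entry.getKey(), entry.getValue()));
        }
    }

    public int size() {
        return log.size();
    }

    public static void main(String[] args) {
        OplogSketch oplog = new OplogSketch();
        oplog.append("k4", "v4");
        oplog.append("k2", "v2");
        oplog.append("k6", "v6");
        oplog.append("k1", "v5");
        oplog.append("k4", "v7"); // supersedes the earlier k4 record
        System.out.println("records before compaction: " + oplog.size());
        oplog.compact();
        System.out.println("records after compaction: " + oplog.size());
        System.out.println("recovered state: " + oplog.recover());
    }
}
```

The stale k4->v4 record is the only one compaction can drop here, which mirrors the slide: the older oplog still carries "Create k4->v4" after the newer one records "Modify k4->v7".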
![Page 15: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/15.jpg)
Concepts
• Member
• A process that has a connection to the system
• A process that has created a cache
• Embeddable within your application
[Diagram: member roles: Client, Locator, Server]
![Page 16: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/16.jpg)
Concepts
• Client cache
• A process connected to the Geode server(s)
• Can have a local copy of the data
• Can be notified about events on the servers
[Diagram: an Application with a Client Cache holding a Region, connected to a GemFire Server hosting Regions]
![Page 17: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/17.jpg)
Concepts
• Functions
• Used for distributed concurrent processing (Map/Reduce, stored procedure)
• Highly available
• Data oriented
• Member oriented
[Diagram: a client submits function f1; functions f1, f2, … fn are executed on the servers]
![Page 18: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/18.jpg)
Concepts
• Functions
[Diagram, steps 1 to 6: a client region calls FunctionService.onRegion.withFilter.execute with filter = Keys X, Y. Within the server distributed system, the call is routed to the servers hosting Partitioned Region Data Store X and Data Store Y, each executes the function and returns a result, and the client reads the results through ResultCollector.getResult. Also shown: Partitioned Region Data Store Z and two Partitioned Region Data Accessors.]
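The data-oriented routing in this diagram can be approximated in plain Java. The sketch below is not the Geode FunctionService API: `FunctionRoutingSketch`, the three in-memory "stores", and the hash-modulo placement are all assumptions for illustration. It sends work only to the stores that own a filter key, runs it in parallel, and collects the partial results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FunctionRoutingSketch {
    static final int STORES = 3; // hypothetical number of data stores

    /** Simplified placement rule (assumption): hash of the key modulo the store count. */
    public static int storeFor(String key) {
        return Math.floorMod(key.hashCode(), STORES);
    }

    public static List<Map<String, Integer>> newStores() {
        List<Map<String, Integer>> stores = new ArrayList<>();
        for (int i = 0; i < STORES; i++) {
            stores.add(new HashMap<>());
        }
        return stores;
    }

    public static void put(List<Map<String, Integer>> stores, String key, int value) {
        stores.get(storeFor(key)).put(key, value);
    }

    /** Sums values for the filter keys, executing only on the stores that own them. */
    public static List<Integer> sumOnOwners(List<Map<String, Integer>> stores, Set<String> filter) {
        // Group the filter keys by owning store so untouched stores receive no work.
        Map<Integer, List<String>> keysByStore = new TreeMap<>();
        for (String key : filter) {
            keysByStore.computeIfAbsent(storeFor(key), s -> new ArrayList<>()).add(key);
        }
        ExecutorService pool = Executors.newFixedThreadPool(STORES);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (Map.Entry<Integer, List<String>> entry : keysByStore.entrySet()) {
                final Map<String, Integer> local = stores.get(entry.getKey());
                final List<String> keys = entry.getValue();
                // The "function": runs next to the data and sees only local entries.
                pending.add(pool.submit(() -> {
                    int sum = 0;
                    for (String key : keys) {
                        sum += local.getOrDefault(key, 0);
                    }
                    return sum;
                }));
            }
            // Result collection: wait for every partial result.
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : pending) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("function execution failed", e);
        } finally {
            pool.shutdown();
        }
    }

    /** Stores X, Y, Z, runs the sum with filter = {X, Y}, and adds up the partial results. */
    public static int demo() {
        List<Map<String, Integer>> stores = newStores();
        put(stores, "X", 10);
        put(stores, "Y", 20);
        put(stores, "Z", 30);
        int total = 0;
        for (int partial : sumOnOwners(stores, new HashSet<>(Arrays.asList("X", "Y")))) {
            total += partial;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("sum over filter keys: " + demo());
    }
}
```

The point of the filter is that the store holding only Z never runs the function; moving the computation to the data avoids shipping entries across the network.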
![Page 19: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/19.jpg)
Concepts
• Listeners
• CacheWriter / CacheListener
• AsyncEventListener (queue / batch)
• Parallel or Serial
• Conflation
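Queueing, batching, and conflation can be illustrated with a small plain-Java queue. This is a sketch under stated assumptions, not Geode's AsyncEventListener machinery: `ConflatingQueueSketch` is a hypothetical class, events are reduced to key/value strings, and a conflated event keeps the queue position of the event it replaces.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ConflatingQueueSketch {
    // Pending events by key, in arrival order. A newer value for a key that is still
    // queued replaces the older one, so only the latest update is delivered (conflation).
    private final Map<String, String> pending = new LinkedHashMap<>();

    public synchronized void enqueue(String key, String value) {
        pending.put(key, value);
    }

    /** Removes up to batchSize events and returns them as "key=value" strings. */
    public synchronized List<String> drainBatch(int batchSize) {
        List<String> batch = new ArrayList<>();
        Iterator<Map.Entry<String, String>> it = pending.entrySet().iterator();
        while (it.hasNext() && batch.size() < batchSize) {
            Map.Entry<String, String> entry = it.next();
            batch.add(entry.getKey() + "=" + entry.getValue());
            it.remove();
        }
        return batch;
    }

    public static void main(String[] args) {
        ConflatingQueueSketch queue = new ConflatingQueueSketch();
        queue.enqueue("k1", "v1");
        queue.enqueue("k2", "v2");
        queue.enqueue("k1", "v5"); // conflates with the queued k1=v1
        // A write-behind listener would now apply this batch to an external store.
        System.out.println(queue.drainBatch(10));
    }
}
```

Conflation suits write-behind to a database, where only the latest value matters; it is the wrong choice when the listener needs every intermediate update.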
![Page 20: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/20.jpg)
Core Beliefs
Performance
Consistency
Resiliency
![Page 21: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/21.jpg)
Why Apache Geode?
Pivotal GemFire High Availability and Fault Tolerance in 6 acts
• Failing data copies are replaced transparently
• Data is replicated to other clusters and sites (WAN)
• Network segmentations ("split brain") are identified and fixed automatically
• Client and cluster disconnections are handled gracefully
• Data is persisted on local disk for ultimate durability
• Failed function executions are restarted automatically
![Page 22: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/22.jpg)
What makes it fast?
• Minimize copying
• Minimize contention points
• Flexible consistency model
• Partitioning and parallelism
• Avoid disk seeks
• Automated benchmarks
![Page 23: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/23.jpg)
Benchmarks and Testing
[Chart: speedup, latency (ms), and CPU % plotted against the number of server hosts (2 to 10)]
![Page 24: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/24.jpg)
The Geode Project
![Page 25: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/25.jpg)
Why Open Source? Why ASF?
• Open source is fundamentally changing software buying patterns
• Customers get transparency and co-development of features
• It’s the community that matters
• ASF provides a framework for open source
![Page 26: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/26.jpg)
Geode Will Be a Significant Apache Project
• 1M+ LOC, over 1,000 person-years invested in cutting edge R&D
• Thousands of production customers in very demanding verticals
• Cutting edge use cases that have shaped product thinking
• A core technology team that has stayed together since founding
• Performance differentiators that are baked into every aspect of the product
![Page 27: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/27.jpg)
Geode versus GemFire
• Geode is a project supported by the OSS community
• GemFire is a product from Pivotal, based on Geode source
• We donated everything but the kitchen sink*
• Development process follows “The Apache Way”
* Except multi-site WAN replication, continuous queries, and the native (C/C++/.NET) client
![Page 28: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/28.jpg)
"Talk is cheap, show me the code"
• Clone & Build
    git clone https://github.com/apache/incubator-geode
    cd incubator-geode
    ./gradlew build -Dskip.tests=true
• Start a server
    cd gemfire-assembly/build/install/apache-geode
    ./bin/gfsh
    gfsh> start locator --name=locator
    gfsh> start server --name=server
    gfsh> create region --name=myRegion --type=REPLICATE
![Page 29: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/29.jpg)
![Page 30: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/30.jpg)
Stock Predictions with
Apache Geode, Spark, and SpringXD
![Page 31: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/31.jpg)
Quick intro to Apache Spark
• RDD
• Dataframe
• Driver
• Worker
"An RDD in Spark is simply an immutable distributed collection of objects. Each RDD is split into multiple partitions, which may be computed on different nodes of the cluster. RDDs can contain any type of Python, Java, or Scala objects, including user-defined classes."
![Page 32: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/32.jpg)
"A dataframe is a distributed collection of rows organized into named columns. An abstraction for selecting, filtering and plotting structured data (pandas), previously known as SchemaRDD."
![Page 33: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/33.jpg)
![Page 34: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/34.jpg)
Live Data
1 - Live data is ingested into the grid
2 - Trained ML model compares new data to historical patterns
3 - Results are pushed immediately to deployed applications
4 - Re-training is triggered, updating the model with the latest historical data
[Diagram labels: Spring XD, Apache Geode / GemFire, Machine Learning model, Data Temperature (Hot, Warm)]
![Page 35: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/35.jpg)
Machine Learning Concepts
[Diagram: price(x), relative strength (x), and medium avg (x) go into a Machine Learning Model (e.g. Linear Regression), which produces medium avg (x+1)]
![Page 36: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/36.jpg)
Machine Learning Model (e.g. Linear Regression)
Features: price(x), relative strength (x), medium avg (x)
Label: medium avg (x+1)
![Page 37: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/37.jpg)
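SpringXD
• Extensible
• Open-Source
• Fault-Tolerant
• Horizontally Scalable
• Cloud-Native
[Diagram: a Spring XD pipeline (Transform, Enrich, Filter, Split, Sink) in three numbered steps, fed by real data or a Simulator, with Indicators, Machine Learning, Predict, and Dashboard stages. Geode regions: /Stocks, /TechIndicators, /Predictions]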
![Page 38: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/38.jpg)
Roadmap
• Off-heap memory storage
• HDFS persistence
• Lucene indexes
• Spark connector
• Cloud Foundry service
• Distributed transactions
…and other ideas from the Geode community!
![Page 39: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/39.jpg)
How to Get Involved
• http://geode.incubator.apache.org
• Join the mailing lists; ask a question, answer a question, learn
[email protected] [email protected]
• File a bug in JIRA
• Update the wiki, website, or documentation
• Create example applications
• Use it in your project! We need you!
![Page 40: Introduction to Apache Geode (Cork, Ireland)](https://reader031.fdocuments.in/reader031/viewer/2022021509/5879e9c11a28ab15288b66bd/html5/thumbnails/40.jpg)
Questions?
Thank you!