infoShare 2014: Mariusz Róg, Big Data in practice -- how to effectively process large sets...
© 2013 Acxiom Corporation. All Rights Reserved.
Better connections. Better results.
How to effectively process large data sets
Big Data in practice
Mariusz Róg, Team Leader of Engineering, Acxiom Global Service Center, Poland
The talk
• Who we are
• The problem
• The solution
Who are we?
"One of the biggest companies you've never heard of."
source: http://en.wikipedia.org/wiki/Acxiom
Who do we work for?
• 7 of the top 10 credit card issuers
• 7 of the top 10 retail banks
• 7 of the top 10 retailers
• 6 of the top 10 telecom / media companies
• 7 of the top 10 automotive manufacturers
• 7 of the top 10 U.S. hotels
• 5 of the top 10 technology companies
• 3 of the top 5 brokerage firms
• 3 of the top 5 pharmaceutical manufacturers
• 8 of the top 10 insurance providers
• 7 of the top 10 hotels
• 3 of the top 5 domestic airlines
• 4 of the top 5 gaming companies
The trademarks and registered trademarks on this page are the property of their respective owners. Stats updated as of 7/8/13.
Where are we?
Note: Acxiom also delivers solutions in many geographies where it does not have a physical presence.
Argentina, Australia, Bahrain, Bangladesh, Brazil, Canada, Chile, China (including Hong Kong and Taiwan), Colombia, Egypt, France, Germany, India, Indonesia, Israel, Japan, Jordan, Korea, Kuwait, Lebanon, Malaysia, Mexico, Mongolia, New Zealand, Oman, Philippines, Poland, Qatar, Russia, Saudi Arabia, Singapore, South Africa, Thailand, United Arab Emirates, United Kingdom, United States, Venezuela, Vietnam
Acxiom provides data, processing, consulting, SMS / digital and / or other services to more than 7,500 recurring clients around the globe in approximately 50 countries and 20 languages.
[Map legend: offices located in these markets; services available in these markets; sample of countries]
What do we do?
We do data!
Marketing and information management services
SaaS development
Analytics
Customer development services
Technical and marketing consulting
ITO support and consulting
Technology R&D
Big Data management
PII and Web security
PII and healthcare worldwide compliance
Forrester Research named Acxiom one of the largest database marketing services and technology providers in the world.
We transform.
Formerly Microsoft, aQuantive & Razorfish
Formerly MySpace, MTV & AOL
Formerly CFO Amazon, NBC & Electronic Arts
Formerly Architect of Google Analytics
Dennis D. Self, CIO, SVP
Formerly Electronic Arts and HP
The Acxiom Audience operating system
Audience Propensities
Regression Model
The flow
[Diagram: input attributes such as Age are fed into model functions f(x,y,z,...)]
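The slides do not say what form f(x,y,z,...) takes. As a minimal sketch, assuming a logistic regression that turns a person's inputs into a propensity score between 0 and 1, applying one model to one person could look like this. PropensityModel, its coefficients, and the sample inputs are hypothetical and do not come from the talk.

```java
final class PropensityModel {
    private final double intercept;
    private final double[] weights;

    PropensityModel(double intercept, double[] weights) {
        this.intercept = intercept;
        this.weights = weights.clone();
    }

    /** Returns a score in (0, 1) for one person's input vector. */
    double score(double[] inputs) {
        if (inputs.length != weights.length) {
            throw new IllegalArgumentException("expected " + weights.length + " inputs");
        }
        double z = intercept;
        for (int i = 0; i < weights.length; i++) {
            z += weights[i] * inputs[i];
        }
        // Logistic link; an assumption, the talk only says "regression model".
        return 1.0 / (1.0 + Math.exp(-z));
    }

    public static void main(String[] args) {
        // Hypothetical coefficients and inputs (age, one binary flag), for illustration only.
        PropensityModel model = new PropensityModel(-1.0, new double[] {0.02, 0.5});
        System.out.println(model.score(new double[] {35.0, 1.0}));
    }
}
```

Scoring every person against every model repeats this call for each (model, person) pair, which is what makes the total volume so large.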
The problem
For example:
• 3823 regression models
• 21 dimensions (avg inputs)
• 31 B (avg size)
• 242271350 people (242M)
The BIG problem
3823 x 21 x 31 x 242271350 = 602 958 394 553 550 B
~548 TB
47 nodes @ 8 x 3 GHz and 32 GB
376 cores
More than a week!
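The numbers on this slide can be checked with a few lines of Java. This is only a sketch of the slide's arithmetic: the class and method names are illustrative, and reading "~548 TB" as binary terabytes (2^40 bytes each) is an assumption that happens to reproduce the slide's figure.

```java
import java.util.Locale;

final class DataVolume {
    /** Total bytes = models x average inputs per model x average input size x people. */
    static long totalBytes(long models, long avgInputs, long avgInputBytes, long people) {
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        long perPerson = Math.multiplyExact(Math.multiplyExact(models, avgInputs), avgInputBytes);
        return Math.multiplyExact(perPerson, people);
    }

    /** Converts bytes to terabytes, taking 1 TB as 2^40 bytes (an assumption). */
    static double toBinaryTerabytes(long bytes) {
        return bytes / (double) (1L << 40);
    }

    public static void main(String[] args) {
        long bytes = totalBytes(3823, 21, 31, 242_271_350L);
        System.out.println(bytes + " B");
        System.out.printf(Locale.ROOT, "%.1f TB%n", toBinaryTerabytes(bytes));
        // 47 nodes with 8 cores each
        System.out.println(47 * 8 + " cores");
    }
}
```

The product still fits in a 64-bit long, so no arbitrary-precision arithmetic is needed for this estimate.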
3 Steps
• µservices
• virtual infrastructure
• the system
Can’t say much...
The foundation
We needed a communication framework
• It needed to be fast
• It needed to be stable
• It needed to be concurrent
The ØMQ
It is not a message queue or an ESB
Confused?
Authors
iMatix
Real-time financial systems
ØMQ
Source: http://zguide2.zeromq.org/
Confused again?
What is it?
• Ø latency communication framework
• Queue-based framework
• Concurrency framework
• Easy and intuitive API
• LGPL
Example…
Client
import org.zeromq.ZMQ;

private static final ZMQ.Context zmqContext = ZMQ.context(zmqThreadCount);

ZMQ.Socket socket = zmqContext.socket(ZMQ.REQ);
socket.setSendTimeOut(sendTimeout);
// A client connects to a concrete host; the "*" wildcard is only valid for bind().
// "localhost" is an assumption, use the server's real address.
socket.connect("tcp://localhost:5555");

// ZMQ_MSG_FLAGS = 0 means a blocking send/receive
boolean messageSent = socket.send(msg, ZMQ_MSG_FLAGS);
if (!messageSent) {
    LOG.error("Error sending request for {}", this);
}

byte[] responseMsg = socket.recv(ZMQ_MSG_FLAGS);
if (responseMsg == null) {
    LOG.error("Error receiving response for {}", this);
}
Server
import org.zeromq.ZMQ;

private static final ZMQ.Context zmqContext = ZMQ.context(zmqThreadCount);
...
// The counterpart of a REQ client is a REP (reply) socket
ZMQ.Socket socket = zmqContext.socket(ZMQ.REP);
socket.setReceiveTimeOut(receiveTimeout);

socket.bind("tcp://*:5555");
socket.bind("inproc://workers");

while (!Thread.currentThread().isInterrupted()) {
    byte[] receivedBytes = socket.recv(0);
    if (receivedBytes == null) {
        LOG.error("Error receiving request for {}", this);
        continue; // nothing to reply to; a REP socket cannot send without a request
    }

    boolean messageSent = socket.send(msg, ZMQ_MSG_FLAGS);
    if (!messageSent) {
        LOG.error("Error sending response for {}", this);
    }
}
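REQ and REP sockets work in strict lockstep: a REQ socket must alternate send, recv, send, ... while a REP socket must alternate recv, send, recv, ..., and a call made out of turn fails. The class below is a plain-Java model of that rule, not ZeroMQ code; LockstepSocket and its methods are hypothetical names used only to illustrate the ordering.

```java
final class LockstepSocket {
    enum Role { REQ, REP }

    private final Role role;
    private boolean expectSend;

    LockstepSocket(Role role) {
        this.role = role;
        // A REQ socket starts by sending a request, a REP socket by receiving one.
        this.expectSend = (role == Role.REQ);
    }

    void send() {
        if (!expectSend) {
            throw new IllegalStateException(role + " socket must recv before it can send");
        }
        expectSend = false;
    }

    void recv() {
        if (expectSend) {
            throw new IllegalStateException(role + " socket must send before it can recv");
        }
        expectSend = true;
    }

    public static void main(String[] args) {
        LockstepSocket client = new LockstepSocket(Role.REQ);
        LockstepSocket server = new LockstepSocket(Role.REP);
        client.send();   // request goes out
        server.recv();   // server picks it up
        server.send();   // reply goes back
        client.recv();   // client reads the reply
        System.out.println("one request-reply round trip completed");
    }
}
```

This is why a server loop should skip the reply when nothing was received: sending on a REP socket without a pending request is an out-of-turn call.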
ØMQ Message
• Atomic
• Can be multipart
• Source/Destination
• Can be routed/proxied/analyzed
• Data agnostic
Preferred: Google Protobuf
ØMQ Socket Types
• Unicast
  • TCP ("tcp://localhost:5555")
  • IPC ("ipc://storeandforward")
  • INPROC ("inproc://emailThread")
• Multicast
  • PGM/EPGM ("epgm://192.168.1.1:5555")
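Every endpoint above has the same shape, transport://address, and the transport prefix alone selects how the bytes travel. The sketch below splits such a string in plain Java; Endpoint and its methods are hypothetical helpers for illustration, not part of any ZeroMQ binding, and the transport list is limited to the ones named on this slide.

```java
import java.util.Set;

final class Endpoint {
    // Transports named on the slide; other transports are deliberately left out.
    private static final Set<String> TRANSPORTS = Set.of("tcp", "ipc", "inproc", "pgm", "epgm");

    final String transport;
    final String address;

    private Endpoint(String transport, String address) {
        this.transport = transport;
        this.address = address;
    }

    /** Splits "transport://address" and rejects unknown transports or empty addresses. */
    static Endpoint parse(String endpoint) {
        int sep = endpoint.indexOf("://");
        if (sep <= 0) {
            throw new IllegalArgumentException("missing transport in: " + endpoint);
        }
        String transport = endpoint.substring(0, sep);
        String address = endpoint.substring(sep + 3);
        if (!TRANSPORTS.contains(transport) || address.isEmpty()) {
            throw new IllegalArgumentException("unsupported endpoint: " + endpoint);
        }
        return new Endpoint(transport, address);
    }

    boolean isMulticast() {
        return transport.equals("pgm") || transport.equals("epgm");
    }

    public static void main(String[] args) {
        for (String s : new String[] {"tcp://localhost:5555", "inproc://emailThread",
                                      "epgm://192.168.1.1:5555"}) {
            Endpoint e = parse(s);
            System.out.println(e.transport + " -> " + e.address
                    + (e.isMulticast() ? " (multicast)" : " (unicast)"));
        }
    }
}
```

Because only the prefix changes, the same socket code can move from threads (inproc) to processes (ipc) to machines (tcp) by swapping the endpoint string.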
Basic Patterns
source: http://zguide.zeromq.org/
Advanced Patterns
source: http://zguide.zeromq.org/
Growing
ØMQ v 2.2
• zmq - 0MQ lightweight messaging kernel
• zmq_bind - accept connections on a socket
• zmq_close - close 0MQ socket
• zmq_connect - connect a socket
• zmq_cpp - interface between 0MQ and C++ applications
• zmq_device - start built-in 0MQ device
• zmq_pgm - 0MQ reliable multicast transport using PGM
• zmq_errno - retrieve value of errno for the calling thread
• zmq_getsockopt - get 0MQ socket options
• zmq_init - initialise 0MQ context
• zmq_inproc - 0MQ local in-process (inter-thread) communication transport
• zmq_ipc - 0MQ local inter-process communication transport
• zmq_msg_close - release 0MQ message
• zmq_msg_copy - copy content of a message to another message
• zmq_msg_data - retrieve pointer to message content
• zmq_msg_init_data - initialise 0MQ message from a supplied buffer
• zmq_msg_init_size - initialise 0MQ message of a specified size
• zmq_msg_init - initialise empty 0MQ message
• zmq_msg_move - move content of a message to another message
• zmq_msg_size - retrieve message content size in bytes
• zmq_pgm - 0MQ reliable multicast transport using PGM
• zmq_poll - input/output multiplexing
• zmq_recv - receive a message from a socket
• zmq_send - send a message on a socket
• zmq_setsockopt - set 0MQ socket options
• zmq_socket - create 0MQ socket
• zmq_strerror - get 0MQ error message string
• zmq_tcp - 0MQ unicast transport using TCP
• zmq_term - terminate 0MQ context
• zmq_version - report 0MQ library version

ØMQ v 4.0
• zmq - 0MQ lightweight messaging kernel
• zmq_bind - accept incoming connections on a socket
• zmq_close - close 0MQ socket
• zmq_connect - create outgoing connection from socket
• zmq_ctx_destroy - terminate a 0MQ context
• zmq_ctx_get - get context options
• zmq_ctx_new - create new 0MQ context
• zmq_ctx_set - set context options
• zmq_ctx_shutdown - shutdown a 0MQ context
• zmq_ctx_term - destroy a 0MQ context
• zmq_curve_keypair - generate a new CURVE keypair
• zmq_curve - secure authentication and confidentiality
• zmq_disconnect - Disconnect a socket
• zmq_pgm - 0MQ reliable multicast transport using PGM
• zmq_errno - retrieve value of errno for the calling thread
• zmq_getsockopt - get 0MQ socket options
• zmq_init - initialise 0MQ context
• zmq_inproc - 0MQ local in-process (inter-thread) communication transport
• zmq_ipc - 0MQ local inter-process communication transport
• zmq_msg_close - release 0MQ message
• zmq_msg_copy - copy content of a message to another message
• zmq_msg_data - retrieve pointer to message content
• zmq_msg_get - get message property
• zmq_msg_init_data - initialise 0MQ message from a supplied buffer
• zmq_msg_init_size - initialise 0MQ message of a specified size
• zmq_msg_init - initialise empty 0MQ message
• zmq_msg_more - indicate if there are more message parts to receive
• zmq_msg_move - move content of a message to another message
• zmq_msg_recv - receive a message part from a socket
• zmq_msg_send - send a message part on a socket
• zmq_msg_set - set message property
• zmq_msg_size - retrieve message content size in bytes
• zmq_null - no security or confidentiality
• zmq_pgm - 0MQ reliable multicast transport using PGM
• zmq_plain - clear-text authentication
• zmq_poll - input/output multiplexing
• zmq_proxy_steerable - start built-in 0MQ proxy with PAUSE/RESUME/TERMINATE control flow
• zmq_proxy - start built-in 0MQ proxy
• zmq_recvmsg - receive a message part from a socket
• zmq_recv - receive a message part from a socket
• zmq_send_const - send a constant-memory message part on a socket
• zmq_sendmsg - send a message part on a socket
• zmq_send - send a message part on a socket
• zmq_setsockopt - set 0MQ socket options
• zmq_socket_monitor - register a monitoring callback
• zmq_socket - create 0MQ socket
• zmq_strerror - get 0MQ error message string
• zmq_tcp - 0MQ unicast transport using TCP
• zmq_term - terminate 0MQ context
• zmq_unbind - Stop accepting connections on a socket
• zmq_version - report 0MQ library version
• zmq_z85_decode - decode a binary key from Z85 printable text
• zmq_z85_encode - encode a binary key as Z85 printable text
Cross platform
Haxe, C++, C#, Clojure, CL, Erlang, F#, Felix, Go, Haskell, Java, Lua, Node.js, Objective-C, Perl, PHP, Python, Racket, ooc, Basic, Ada, Tcl, Scala, Ruby, Q
• NetMQ: https://github.com/zeromq/netmq
• JeroMQ: https://github.com/zeromq/jeromq
Visit us on
https://developer.myacxiom.com/
http://acxiom.com/about-acxiom/careers/
Thank You!