Silicon valley nosql meetup april 2012

description

Join Objectivity, Inc.’s VP of Product Management, Brian Clark, for a discussion of the latest trends in Big Data Analytics: defining what Big Data is and understanding how to maximize your existing architectures by using NOSQL technologies to improve functionality and provide real-time results. The talk focuses on relationship analytics and introduces NOSQL data stores, including object and graph databases, such as the architecture behind Objectivity/DB and InfiniteGraph.

Transcript of Silicon valley nosql meetup april 2012

Page 1: Silicon valley nosql meetup  april 2012

Maximize your Data with Real-time Big Data Analytics using NOSQL Technologies.

Silicon Valley NOSQL Meetup Group

Thursday, April 26, 2012 – Brian Clark

04/10/2023 © Objectivity Inc 2012 1

Page 2: Silicon valley nosql meetup  april 2012

Agenda

• About me!

• Objectivity, Inc.

• NOSQL

• Big Data

• Use Cases

• InfiniteGraph and Objectivity/DB Overview

• Demo

• Q & A

Page 3: Silicon valley nosql meetup  april 2012

School - The 3 R’s

• Reading

• wRiting

• aRithmetic

• I knew I was in trouble!

Page 4: Silicon valley nosql meetup  april 2012

University - The 3 B’s

• Bands (Friday night Hop)

• Booze

• Birds

• I knew I was in trouble!

• = a job as a mainframe computer operator

Page 5: Silicon valley nosql meetup  april 2012

A Brief History of Computing

Page 6: Silicon valley nosql meetup  april 2012

A Brief History of Computing

Page 7: Silicon valley nosql meetup  april 2012

A Brief History of Computing

Page 8: Silicon valley nosql meetup  april 2012

A Brief History of Databases

• Hierarchical Model (1960’s): Physical pointers

• Network Model (1960’s): Many-to-many relationships, but still too rigid

• Relational Model (1970’s): Physical independence, SQL

• Object-Oriented (1990’s): Performance with Complexity and Scalability

Page 9: Silicon valley nosql meetup  april 2012

Objectivity, Inc.

• The world today is about big data, distributed objects and connections between them.

• Objectivity/DB™: Distributed big data and object management.

• InfiniteGraph™: Connects the dots on a global scale.

Page 10: Silicon valley nosql meetup  april 2012

NOSQL

Page 11: Silicon valley nosql meetup  april 2012

InfiniteGraph in the “NOSQL” Market

Page 12: Silicon valley nosql meetup  april 2012

The Right Tool for the Right Job (1 of 2)

First, a truism:

• The closer the data model matches the data store structure, the faster queries can be executed, the higher the scalability, and the easier it is to write applications.

• One size doesn’t fit all, and multiple tools might join forces to fully solve a problem.

Relational Databases

• Data represented by rows (records) and columns (attributes); a schema defines the columns and their distribution amongst tables.

• Versatile, can solve most data storage and access problems; can solve all if scale is limited.

• Good for producing lists of data based on a value in that data, such as a list of customers with unfilled orders.

Hadoop/MapReduce

• General purpose parallel processing and storing facility for massive amounts of data.

• Data store is a file system, not a database.

• Good for problems that can be broken into many small parts and processed independently, and done so offline, such as the ETL (extract, transform, load) process for preparing and moving captured data into a data warehouse.

Object Databases

• Data represented by objects, which are groups of attributes; schema defines the attributes, which may include pointers (relationships) to other objects.

• Ability to store and retrieve whole objects makes access to sets of data very fast; tighter connection to object-oriented programming application reduces complexity.

• Good for accessing massive amounts of data about related items, such as a user’s account history.

Page 13: Silicon valley nosql meetup  april 2012

The Right Tool for the Right Job (2 of 2)

Key-Value Databases

• Rows and columns like a relational database, but only 2 columns, making it an indexing system (find a value based on the key).

• No schema required, so the value could be anything, such as an object or a pointer to data in another data store.

• Very fast for indexing, such as looking up a user’s shopping cart on an ecommerce site.

Column Family Databases

• Rows and columns like a relational database, but storage on disk is organized so as to make attributes (columns) highly accessible without accessing the whole of the associated record (row).

• Results in very fast actions regarding attributes, such as calculating average age.

Document Databases

• Similar to object database, but without the need to predefine an object’s attributes (i.e., no schema required).

• Provides flexibility to store new types or unanticipated sizes of data/objects during operation, on the fly, such as event logging where the data format is unpredictable and not just simple text (e.g., video).

Graph Databases

• Similar to object database, but the objects and relationships between them are all objects with their own respective sets of attributes.

• Enables very fast queries when the value of the data is in the relationships, i.e. relationships between people/items:

– Are two people/items related (even if separated by several levels of relationship)?

– Where the relationships represent costs, what is the optimal combination of groups of people/items?
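
The Column Family point about reaching one attribute without touching the whole record can be illustrated with a small in-memory sketch. This is not how any particular product stores data; the records, field names and function names below are made up for illustration, and plain Python lists stand in for on-disk layout.

```python
# Hypothetical records; names, ages and cities are illustrative only.
rows = [
    {"name": "Alice", "age": 34, "city": "Denver"},
    {"name": "Bob", "age": 41, "city": "Boston"},
    {"name": "Carlos", "age": 29, "city": "Austin"},
]


def average_age_row_store(records):
    # Row-oriented layout: every whole record is visited to read one attribute.
    return sum(r["age"] for r in records) / len(records)


def to_column_store(records):
    # Column-oriented layout: the values of each attribute are kept together.
    columns = {}
    for r in records:
        for key, value in r.items():
            columns.setdefault(key, []).append(value)
    return columns


def average_age_column_store(columns):
    # Only the "age" column is read; names and cities are never touched.
    ages = columns["age"]
    return sum(ages) / len(ages)


columns = to_column_store(rows)
print(average_age_row_store(rows), average_age_column_store(columns))
```

Both functions return the same average; the difference is how much data has to be read to get it, which is the advantage the slide attributes to column family stores.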

Page 14: Silicon valley nosql meetup  april 2012

Big Data

Page 15: Silicon valley nosql meetup  april 2012

Big Data

• Volume

• Velocity

• Variety

= VALUE!

Requires new ways of thinking – distributed data and processing

Page 16: Silicon valley nosql meetup  april 2012

Parallel Processing and Storage

Apache HADOOP

• Map/Reduce: Distributed processing.

• HDFS: Distributed file system.

• HBase: Distributed storage for large tables.

• Cassandra: Multi-master database with no single point of failure.

InfiniteGraph

• Distributed processing: Peer-to-peer servers and clients anywhere in the network.

• Distributed data: Federation of databases anywhere in the network.

• Standard filesystem: Random I/O for fast navigational queries.

• Single logical view of all data in the federation: Any client anywhere can access server anywhere.
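
As a rough illustration of the Map/Reduce style of processing listed above, the sketch below runs the three classic phases (map, shuffle, reduce) in a single Python process. It does not use the Hadoop API; the function names and the word-count task are assumptions chosen only to show why the work splits into independent parts.

```python
from collections import defaultdict


def map_phase(record):
    # Each record is processed independently and emits (key, value) pairs.
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    # Group all values by key so each key can be reduced on its own.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    # Each key is reduced independently of every other key.
    return key, sum(values)


def word_count(records):
    mapped = [pair for record in records for pair in map_phase(record)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


print(word_count(["big data", "Big graph"]))
```

Because every call to `map_phase` and `reduce_phase` depends only on its own input, a real framework can spread those calls across many machines; here they simply run in a loop.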

Page 17: Silicon valley nosql meetup  april 2012

Common Big Data Architecture

[Diagram: Data Aggregation & Application Analytics over a set of data stores (RDBMS, ObjectDB, GraphDB, DocumentDB, Hadoop, BigTable, Key-Value Stores, Column Stores, Data Warehouse), covering Structured, Semi-structured and Un-structured data, on Commodity Linux Clusters or High Performance Compute platforms.]

Page 18: Silicon valley nosql meetup  april 2012

Common Big Data Architecture

[Diagram layers: Visualization and Analytics tools; Hadoop, RDBMS, Other stores; Front End Processing; Raw Data. Annotated: Observe, Orient, Decide, Act.]

The strategic competitors are all moving in this direction for Big Data

Page 19: Silicon valley nosql meetup  april 2012

Big Data Analytics Solutions

• EMC: Data Analytics Applications; Greenplum Hadoop; Greenplum; Greenplum Data Integration Accelerator; Raw Data

• IBM: Infosphere BigInsights; IBM Hadoop; DB2; Infosphere Warehouse; Front End Processing; Raw Data

• Oracle: Oracle In-Database Analytics; Cloudera Hadoop; Oracle 11g; Oracle NoSQL; Oracle Data Integrator; Raw Data

• HP: Autonomy; Vertica Database; Front End Processing; Raw Data

Page 20: Silicon valley nosql meetup  april 2012

Big Data Landscape

• All current solutions have the same basic architecture model.

• None of the current solutions have a way to store connections between entities in the different silos.

– Analytics today focuses on the nodes of data (quantifiable occurrences) rather than the relevant connections or edges between the nodes (qualitative occurrences).

• Objectivity has a proven way to efficiently store, manage and query the relationships and connections between data.

Page 21: Silicon valley nosql meetup  april 2012

Disruptive Big Data New Architecture

[Diagram layers: Visualization and Analytics tools; Hadoop, RDBMS, Other stores; Front End Processing; Raw Data. Added alongside: The Proven Connection Store (Objectivity/DB and/or InfiniteGraph), also drawing on Raw Data.]

Legend: one symbol represents data nodes; the other represents bidirectional relationships/connections between data.

Page 22: Silicon valley nosql meetup  april 2012

Why We’re Different

• Relational databases are not optimized to understand objects or connections.

• Objectivity/DB™ is all about objects and relationships.

• InfiniteGraph™ is all about the connections as first class citizens.

Page 23: Silicon valley nosql meetup  april 2012

Use Cases & Challenges

Page 24: Silicon valley nosql meetup  april 2012

Relationships are everywhere

• CRM, Sales & Marketing

• Network Mgmt, Telecom

• Intelligence (Government & Business)

• Finance

• Healthcare

• Research: Genomics

• Social Networks

• Logistics

• Master Data Management

• PLM (Product Lifecycle Mgmt)

Page 25: Silicon valley nosql meetup  april 2012

Financial Services

Fraud Detection

– Problem: Detect patterns of fraudulent activities before damage is done

– Solution: Real-time identification of inconsistencies enables instantaneous notification to security systems

– Results:

• Improved banking security and client confidence

• Reduction of lost revenues

• Improved efficiency allows fraud-detection teams to develop and deploy additional services

Page 26: Silicon valley nosql meetup  april 2012

Application Development

The “Facebook” For Education

– Problem: Develop a system capable of handling exponential user-base growth

– Solution: Leverage InfiniteGraph’s scalability and performance to support real-time relationship information between all members and to act as primary DB for all topics and users

– Results: Complete social networking site allowing global users to access courses from leading institutions & to collaborate effectively with other students and teachers

Page 27: Silicon valley nosql meetup  april 2012

Use Case – Confidential Ad Placement Network

• Ad placement on smart phone based on user profile and location data generated by opt-in application (e.g., a free game).

• Location data captured and distilled by Cassandra (key-value/column family hybrid database).

• Locations matched with geospatial data to refine user interests.

• As ad placement orders arrive, InfiniteGraph matches groups of users with ads, maximizing relevance for the user, value for the advertiser and revenue for the ad placement company.

Page 28: Silicon valley nosql meetup  april 2012

Government

Broad Area Maritime Surveillance UAS

– Problem: Monitor potential threats across open oceans and remote areas on a 24/7 basis

– Solution: Use Objectivity/DB to develop a system for unmanned aircraft to capture and transmit real-time data of any type for analysis and sharing

– Results: A federated view of maritime surveillance and continuous reconnaissance capability for mission, reconnaissance, and communications assessments

Page 29: Silicon valley nosql meetup  april 2012

Healthcare

Bring together doctors, patients, and their records

– Problem: As patients move between doctors, manage their records globally to better capture and understand symptoms, causes, and interdependencies and to improve diagnoses

– Solution: Create a database using Objectivity/DB and InfiniteGraph capable of managing real-time entries of patient visits, symptoms, diagnoses, reactions to medications, and progress

– Results:

• Improved times to more accurate diagnoses

• Creation of a knowledge base of similar medical cases

• Increased success rates of initial prescriptions based on historical recommendations

Page 30: Silicon valley nosql meetup  april 2012

Network Centric Collaborative Targeting

Team: Objectivity, L-3, and Lockheed

• U.S. Air Force’s Network Centric Collaborative Targeting (NCCT)

• U.S. Navy’s Cooperative Engagement Capability (CEC) system

Page 31: Silicon valley nosql meetup  april 2012

NCCT - Customer Challenge

• Time sensitive targets were hard to find

• Sensors operated as independent systems

• The performance of each individual sensor is very good (great ears and eyes) but collectively they lack a central nervous system

• Mountains of data are coming from sensors

• Existing sensors alone cannot reliably find highly mobile, moving and/or spoofing targets

• Silo’d systems with individual reports did not provide solutions

Page 32: Silicon valley nosql meetup  april 2012

NCCT - Technical Solution Architecture

Company Confidential

1. Build a distributed system that could support multi-agency platform requirements

2. Collect data from any number of high volume sources

3. Provide a data architecture that supported the need to correlate and fuse data collection for a single view of the targets

4. Support a near real-time data reporting C4ISR system

Page 33: Silicon valley nosql meetup  april 2012

Intelligence - Customer Need

• Collect 400,000,000 phone calls, plus addresses, emails, meetings….

• Find the links between callers

• Deliver all the possible connections between them in seconds

Page 34: Silicon valley nosql meetup  april 2012

Intelligence Problem - Performance

With a relational product:

• Initial attempts to traverse links across the database literally shut down the server.

• After much server and database optimization, a process could be run on a single query and would produce a result over a 48 hour period.

• Results were unacceptable…..

With Objectivity:

• The many-to-many data application was an excellent fit for Objectivity.

• We then developed a proof-of-concept that showed 5-6 degrees of separation within about 1 minute, running on a laptop computer.
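
To make "degrees of separation" concrete, here is a minimal sketch of the kind of traversal involved: a breadth-first search over call records that returns the number of hops between two callers. It is an in-memory illustration only, not the proof-of-concept described above; the function name, the undirected treatment of calls and the sample data are all assumptions.

```python
from collections import deque


def degrees_of_separation(calls, start, goal):
    """Return the fewest hops linking start to goal, or None if unconnected."""
    # Treat each call as an undirected link between caller and callee.
    neighbours = {}
    for caller, callee in calls:
        neighbours.setdefault(caller, set()).add(callee)
        neighbours.setdefault(callee, set()).add(caller)

    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        if person == goal:
            return hops
        # Each step depends on the previous one: only newly reached
        # callers are expanded further.
        for other in neighbours.get(person, ()):
            if other not in seen:
                seen.add(other)
                queue.append((other, hops + 1))
    return None


# Hypothetical call records.
calls = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "F")]
print(degrees_of_separation(calls, "A", "D"))
```

In a relational store each hop of this search typically becomes another self-join on the calls table, whereas a store that keeps the links with the objects can follow them directly.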

Page 35: Silicon valley nosql meetup  april 2012

InfiniteGraph & Objectivity/DB Technical Overview

Page 36: Silicon valley nosql meetup  april 2012

What is a graph database?

• Optimized around data relationships

– Relationships as first class citizens

– Super fast traversal between entities

– Rich/flexible annotation of connections

• Small focused API (typically not SQL)

– Natively work with concepts of Vertex/Edge

– SQL has no concept of “navigation”

• Graphs grow quickly, e.g.

– Billions of phone calls / day in US

– Emails, social media events, IP Traffic

– Financial transactions

• Some analytics require navigation of large sections of the graph

• Each step (often) depends on the last

• Must distribute data and go parallel

Page 37: Silicon valley nosql meetup  april 2012

Database Data Representation

• Traditional databases are good at recording things, not events or relationships

Rows/Columns/Tables

Meetings
P1     Place   Time     P2
Alice  Denver  5-27-10  Bob

Calls
From  Time   Duration  To
Bob   13:20  25        Carlos
Bob   17:10  15        Charlie

Payments
From    Date     Amount  To
Carlos  5-12-10  100000  Charlie

Relationship/Graph Optimized

Alice --Met (5-27-10)-- Bob
Bob --Called (13:20)--> Carlos
Bob --Called (17:10)--> Charlie
Carlos --Paid (100000)--> Charlie
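
The same data can be sketched as a small property graph, where each relationship is an object with its own attributes and a query navigates along edges that pass a filter. This uses plain Python rather than the InfiniteGraph API; the `Edge` class, the `reachable` function and the edge directions (taken from the From/To columns, with the meeting treated as Alice to Bob) are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    kind: str      # relationship type, e.g. "Met", "Called", "Paid"
    source: str
    target: str
    attrs: dict = field(default_factory=dict)


# One edge per table row from the slide.
edges = [
    Edge("Met", "Alice", "Bob", {"place": "Denver", "time": "5-27-10"}),
    Edge("Called", "Bob", "Carlos", {"time": "13:20", "duration": 25}),
    Edge("Called", "Bob", "Charlie", {"time": "17:10", "duration": 15}),
    Edge("Paid", "Carlos", "Charlie", {"date": "5-12-10", "amount": 100000}),
]


def reachable(edges, start, edge_filter=lambda e: True):
    """People reachable from start by following edges that pass the filter."""
    found, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for e in edges:
            if e.source == node and edge_filter(e) and e.target not in found:
                found.add(e.target)
                frontier.append(e.target)
    return found


print(reachable(edges, "Alice"))
print(reachable(edges, "Bob", lambda e: e.kind == "Called"))
```

The linear scan over `edges` keeps the sketch short; a graph database would instead keep each vertex's edges with the vertex so that a step does not search the whole edge set.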

Page 38: Silicon valley nosql meetup  april 2012

Viewing the Data

The InfiniteGraph Visualizer will need this name to display the contents of the graph database.

Page 39: Silicon valley nosql meetup  april 2012

InfiniteGraph™

• Connects the dots on a global scale.

• InfiniteGraph™ finds connections in big data.

Page 40: Silicon valley nosql meetup  april 2012

Find Answers Faster with InfiniteGraph™ Distributed Graph Database

Page 41: Silicon valley nosql meetup  april 2012

InfiniteGraph’s Unique Advantages

• Supports large scale and distributed systems.

• Proven technology and deployments.

• Flexible and Easy: Distributed and cloud ready, Java on interoperable platforms, integrates with most other data stores, supports ACID to flexible modes.

Page 42: Silicon valley nosql meetup  april 2012

InfiniteGraph Basic Architecture

[Diagram components: User Apps; Blueprints; InfiniteGraph - Core/API (Configuration, Navigation Execution, Management Extensions); Session / TX Management; Placement; Distributed Object and Relationship Persistence Layer.]

Page 43: Silicon valley nosql meetup  april 2012

InfiniteGraph Features

• Distributed parallel ingest.

• Flexible distributed storage management.

• Node naming and indexing for fast lookup.

• User controlled navigational queries – using node and edge filters.

• Navigator plug-in architecture for sharing plug-ins with the visualizer.

• InfiniteGraph Visualizer.

• Gremlin support via Blueprints.

Page 44: Silicon valley nosql meetup  april 2012

Objectivity/DB Basic Architecture

[Diagram components: User Application; C++ Public API, Java API, Python API, C#/.NET, ULB; Objy Kernel; I/O Manager; Lock Server, Page Server (AMS), Query Server.]

Page 45: Silicon valley nosql meetup  april 2012

Distributed Data / Processing

Distributed Federated Persistent Store

• Federated Data Management: Single Logical View. All clients and servers see all data.

• Distributed Data Management: Scale Out.

[Diagram labels: Network; SAN; Scale Out.]

Page 46: Silicon valley nosql meetup  april 2012

Distributed Data Architecture

64 Bit OID (Object ID)

#21538 - 1874 - 9638 - 164

[Diagram: the OID segments are labelled Database, Container, Page and Slot, within a Federation (schema & catalog); a Database holds Containers, with “64K” marked at two levels.]

• 1,000’s of trillions of unique objects

• 1,000’s of petabytes of storage

• Logical/physical indirection at every segment

• Resolving ID fast regardless of number of objects
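
A 64-bit identifier with four segments can be sketched as four 16-bit fields packed into one integer. The slide gives only the total width, the four segment labels and the "64K" markers, so the field order (database, container, page, slot), the equal 16-bit widths and the function names below are assumptions for illustration rather than a statement of the product's actual layout.

```python
FIELD_BITS = 16                      # assumed width of each segment (64K values)
FIELD_MASK = (1 << FIELD_BITS) - 1


def pack_oid(database, container, page, slot):
    """Pack four 16-bit segments into a single 64-bit integer."""
    fields = (database, container, page, slot)
    if any(not 0 <= f <= FIELD_MASK for f in fields):
        raise ValueError("each segment must fit in 16 bits")
    return (database << 48) | (container << 32) | (page << 16) | slot


def unpack_oid(oid):
    """Split a 64-bit OID back into (database, container, page, slot)."""
    return tuple((oid >> shift) & FIELD_MASK for shift in (48, 32, 16, 0))


def format_oid(oid):
    """Render an OID in a dash-separated form like the one on the slide."""
    return "#" + "-".join(str(f) for f in unpack_oid(oid))


oid = pack_oid(21538, 1874, 9638, 164)
print(format_oid(oid))
```

Because each segment is read with a shift and a mask, the cost of resolving an identifier does not depend on how many objects exist, which fits the slide's point about resolving IDs quickly.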

Page 47: Silicon valley nosql meetup  april 2012

Distributed Processing Architecture

[Diagram: a Client (Application, Objectivity/DB, Cache) and Simple, Distributed Servers (Lock Servers, Data Servers, Query Agents).]

Put the data and processing where it’s needed

Page 48: Silicon valley nosql meetup  april 2012

Flexibility – language interoperability

[Diagram: objects A, B, C, D, E and F; a Java App, a C++ App, a C# App and a Python App, each on Objectivity/DB.]

Page 50: Silicon valley nosql meetup  april 2012

InfiniteGraph™ - Link Hunter demonstration

Page 51: Silicon valley nosql meetup  april 2012

Comprehensive Online Resources

• InfiniteGraph.com (main site, content and messaging)

• Download InfiniteGraph

• Product Documentation

• InfiniteGraph Developer Wiki

• Google Group for Developers

• Our Blog

Page 52: Silicon valley nosql meetup  april 2012

Company Snapshot

Corporate

• Established in 1988

• Headquartered in Sunnyvale, California

• NOSQL platform for managing and discovering relationships between complex data

Products

• Objectivity/DB™: Object-oriented data management system that manages localized, centralized or distributed databases

• InfiniteGraph™: New massively scalable graph database that enables organizations to find, store, and exploit the relationships hidden in their data

Customers

• Deeply embedded in nearly 90 enterprises and government organizations

• Competitive advantages in Big Data with strong IP and patent position

• Growing pipeline of near-term opportunities across expanding use cases

Financials & Ownership

• Generating increased revenues in last twelve months

• Profitable and cash flow positive; no debt

• Ownership: Privately held by employees and venture investors

Market Opportunity

• Big Data Market forecasted to be $11.6B in 2012, with CAGR of 28.0% over the next 5 years

• 40% per year data growth, cloud adoption, mobile usage and improved real-time, predictive analytics underpin Objectivity’s growth opportunities

• Strategically positioned as key Big Data enabler that pulls through servers, DBs and file stores
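
For reference, the market forecast above implies a figure that can be worked out by compounding. The sketch below applies the stated 28.0% CAGR to the stated $11.6B for five years; the function name is made up, and the result is only the arithmetic consequence of the two numbers on the slide, not an independent forecast.

```python
def compound(value, rate, years):
    """Grow value at a constant annual rate for the given number of years."""
    return value * (1 + rate) ** years


# $11.6B in 2012, growing 28.0% per year for 5 years (figures from the slide).
projected = compound(11.6, 0.28, 5)
print(f"Implied market size after 5 years: ${projected:.1f}B")
```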

Page 53: Silicon valley nosql meetup  april 2012

Brian Clark

VP Product Marketing, Objectivity Inc.

http://www.infinitegraph.com

http://www.objectivity.com
