Look Ma! No more blobs

Post on 28-Nov-2014


GridFS is a storage mechanism for persisting large binary data in MongoDB.


Look Ma! No more blobs

Aparna Chaudhary

NoSQL matters @ Cologne, Germany, 2013

EMBRACE POLYGLOT PERSISTENCE!

STOP RDBMS ABUSE!

KNOW YOUR USE CASE

Parse

Extract

Store

Read XML

We don't do rocket science...

Use Case

Runtime support for document types

Metadata definition provided at runtime

Document type names: max 50 characters

Look up content based on metadata

RA

Challenges

Storage of up to one million documents of 10KB to 2GB per document type per year

Write 1MB < x msec

Retrieve 1MB < y msec

...and details

RA

But…the Numbers make it interesting...

How?

File System

MongoDB

RDBMS

JCR

Document Management

If you want to store files, it's logical to use the file system.

Ain't it?

File System

✓ Ease of Use

✓ No special skill-set

✓ Backup and Recovery

✓ It’s free!

How do I name them?

Support for metadata storage?

Performance with too many small files?

Query - Administration?

High Availability?

Limitation on total number of files?

Relational database

Integrity

Consistency

Durability

Atomicity

Joins

Backups

High Availability

You name it, We have it!

RDBMS

Aggregations

RDBMS Developer’s Perspective

Challenge #1

RA

We need runtime support for document type.


Challenge #1

DOC_1 DOC_2 DOC_3

DOC_4 DOC_5 DOC_6

Dynamic DDL Generation

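On the RDBMS side, dynamic DDL generation amounts to assembling CREATE TABLE statements as strings at runtime. A minimal sketch in Python (the table-name prefix, column types, and helper name are illustrative, not from the talk):

```python
def create_table_ddl(doc_type: str, metadata_fields: list) -> str:
    """Assemble a CREATE TABLE statement for a document type at
    runtime -- the string concatenation that gets ugly fast."""
    cols = ", ".join(f"{name} VARCHAR(255)" for name in metadata_fields)
    return (f"CREATE TABLE DOC_{doc_type.upper()} "
            f"(ID NUMBER PRIMARY KEY, CONTENT BLOB, {cols})")

print(create_table_ddl("invoice", ["customer", "country"]))
```

Every document type registered at runtime triggers another such statement, which is why the next step is building a utility to hide the concatenation.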

Challenge #1

String concatenations are ugly…

DEV


Challenge #1

Let's build a utility.

DEV


Challenge #1

More work, more work…

Challenge #2

RA

Document type names can be up to 50 characters long.


Challenge #2

TABLE NAME LIMITS

Wait… SQL-92 says 128 characters?

"We rule. Let's support only 30 characters."


Challenge #2

DOC_TYPE_MAPPING

Let's create a mapping table.

DEV


Challenge #2

Ugly unreadable table names!

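One deterministic way to squeeze a 50-character document type name under a 30-character identifier limit is truncate-plus-hash, with the original name kept in the DOC_TYPE_MAPPING lookup table. This scheme is my own illustration, not the one used in the talk, and it produces exactly the kind of unreadable names the slide complains about:

```python
import hashlib

def short_table_name(doc_type: str, max_len: int = 30) -> str:
    """Derive an RDBMS-safe table name; names over the limit are
    truncated and suffixed with a short hash to stay unique."""
    if len(doc_type) <= max_len:
        return doc_type.upper()
    digest = hashlib.md5(doc_type.encode()).hexdigest()[:8].upper()
    return f"{doc_type[:max_len - 9].upper()}_{digest}"

print(short_table_name("a_very_long_document_type_name_over_thirty_chars"))
```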

So... finally...

Read XML

Dynamic DDL generation

Document Type Alias

Document type defined?

Yes

No

Extract Metadata

Store Metadata

Store Content

Simple use case becomes complex...

Remember...Our Challenge

QA

Let's see if we are in spec for response time.

Aah..what about performance now?

DEV

MongoDB

Document Based

GridFS

B-Tree

Dynamic Schema

JSON / BSON

Query

Scalable

http://www.10gen.com/presentations/storage-engine-internals

Joins

Complex Transactions

[Diagram: B-tree index structure — keys ID1–ID5 pointing to field/chunk entries F1–F9]

Concepts

[Diagram: several Databases, each containing multiple Collections]

Table = Collection

Column = Field

Row = Document

Database = Database

GridFS

MongoDB divides large content into chunks

Stores Metadata and Chunks separately

http://docs.mongodb.org/manual/core/gridfs/

> mybucket.files
{
  "_id" : ObjectId("514d5cb8c2e6ea4329646a5c"),
  "chunkSize" : NumberLong(262144),
  "length" : NumberLong(103015),
  "md5" : "34d29a163276accc7304bd69c5520e55",
  "filename" : "health_record_2.xml",
  "contentType" : "application/xml",
  "uploadDate" : ISODate("2013-03-23T07:41:44.907Z"),
  "aliases" : null,
  "metadata" : { "fname" : "Aparna", "lname" : "Chaudhary", "country" : "Netherlands" }
}

ObjectId - 12 Byte BSON:
4 Byte - Seconds since Epoch
3 Byte - Machine Id
2 Byte - Process Id
3 Byte - Counter
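That byte layout can be checked by slicing the hex string of the `_id` above; the timestamp seconds line up with the document's `uploadDate`. A small sketch (the helper name is mine):

```python
import datetime

def parse_object_id(oid_hex: str) -> dict:
    """Split a 24-hex-char (12-byte) ObjectId into its classic
    BSON parts: timestamp, machine id, process id, counter."""
    assert len(oid_hex) == 24
    return {
        "timestamp": datetime.datetime.fromtimestamp(
            int(oid_hex[0:8], 16), tz=datetime.timezone.utc),
        "machine_id": int(oid_hex[8:14], 16),
        "process_id": int(oid_hex[14:18], 16),
        "counter": int(oid_hex[18:24], 16),
    }

parts = parse_object_id("514d5cb8c2e6ea4329646a5c")
print(parts["timestamp"])  # 2013-03-23 07:41:44+00:00 -- matches the uploadDate above
```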

> mybucket.chunks
{
  "_id" : ObjectId("514d5cb8c2e6ea4329646a5d"),
  "files_id" : ObjectId("514d5cb8c2e6ea4329646a5c"),
  "n" : 0,
  "data" : BinData(0, ...)
}

I'm storing a 10KB file, but would it use 256KB on disk?

Last chunk = FileSize % 256KB + metadata overhead

1128KB file → 256 + 256 + 256 + 256 + 104 + x

10KB file → 10 + x

A chunk is only as big as it needs to be...
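The chunk arithmetic from the slide can be written out directly (sizes in KB, with the 256KB default chunk size of MongoDB 2.x; the per-chunk metadata overhead "x" is left out of the sketch):

```python
def gridfs_chunk_sizes(file_size_kb, chunk_size_kb=256):
    """Sizes in KB of the chunks GridFS stores for a file; only the
    last chunk is smaller than the configured chunk size."""
    full, last = divmod(file_size_kb, chunk_size_kb)
    return [chunk_size_kb] * full + ([last] if last else [])

print(gridfs_chunk_sizes(1128))  # [256, 256, 256, 256, 104]
print(gridfs_chunk_sizes(10))    # [10]
```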

Challenge #1

DEV

MongoDB supports Dynamic Schema.

You can use collection per docType and they are created dynamically.

RA

We need runtime support for document type.

Challenge #2

RA

Document type names can be up to 50 characters long.

DEV

A MongoDB namespace can be up to 123 characters.
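The namespace is the full `database.collection` string, so a collection-per-docType scheme can be guarded with a simple length check. A sketch (the helper is illustrative; the 123-character figure is the one quoted in the talk):

```python
def validate_namespace(db_name: str, collection_name: str,
                       max_len: int = 123) -> str:
    """Build the 'db.collection' namespace and reject names that
    would exceed the quoted 123-character limit."""
    ns = f"{db_name}.{collection_name}"
    if len(ns) > max_len:
        raise ValueError(f"namespace too long: {len(ns)} > {max_len}")
    return ns

# A 50-character document type name fits comfortably:
print(validate_namespace("docs", "A" * 50))
```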

So... finally...

Simple use case remains simple... well, becomes simpler...

Read XML

Extract Metadata

Store Metadata & Content

Remember...Our Challenge

QA

Let's see if we are in spec for response time.

DEV

Performance test is part of our definition of 'DONE'

Because seeing is believing!

Demo

‣ GridFS 2.4.0

‣ PostgreSQL 9.2

‣ Spring Data

‣ JMeter 2.7

‣ Mac OS X 10.8.3 2.3GHz Quad-Core Intel Core i7, 16GB RAM

https://github.com/aparnachaudhary/nosql-matters-demo

EMBRACEPOLYGLOT

PERSISTENCE!

STOP RDBMS ABUSE!

KNOW YOUR USE CASE

@aparnachaudhary

Java Developer, Data Lover

Eindhoven, Netherlands

http://blog.aparnachaudhary.com/

@aparnachaudhary

Thank You!