Introduction to CouchDB - LA Hacker News

Michael Parker February 24, 2013


A talk on Apache CouchDB presented to the LA Hacker News meetup. Covers the basics of its RESTful principles, CRUD operations, views, MVCC, and performance. 30 minutes with Q&A.

Transcript of Introduction to CouchDB - LA Hacker News

Page 1: Introduction to CouchDB - LA Hacker News

Michael Parker
February 24, 2013

Page 2: Introduction to CouchDB - LA Hacker News

What is CouchDB?
● One of those hipster NoSQL databases
● Document-oriented, not relational
  ○ No joins, prefer denormalization
● RESTful API for all operations
● Views add structure via secondary indexes
● Other perks:
  ○ Multiversion concurrency control (lock-free)
  ○ MapReduce framework
  ○ Great web admin interface
  ○ Multi-master replication

Page 3: Introduction to CouchDB - LA Hacker News

Document-oriented
● Like a key-value store, flat namespace
  ○ In CouchDB, the key is called a document identifier, or doc id
● Documents stored in JSON format
  ○ "Schemaless," but the schema resides in your app
  ○ JSON is also the format for HTTP bodies
● Similar: MongoDB (no relation), Redis, Cassandra

Page 4: Introduction to CouchDB - LA Hacker News

JSON Document in CouchDB

    {
      "_id": "emp_001",
      "_rev": "1-4c6114c65e295552ab1019e2b046b10e",
      "name": {
        "first": "Dante",
        "last": "Hicks"
      },
      "phone": "310-555-1212",
      "interests": ["philosophy", "Star Wars"]
    }
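Since the schema resides in the app rather than in CouchDB, the app usually checks a document's shape before writing it. The sketch below is one way to do that for the employee document; `validateEmployee` and its rules are hypothetical and are not part of CouchDB.

```javascript
// Hypothetical app-side check for the employee document from the slide.
// CouchDB itself does not enforce any of these rules.
function validateEmployee(doc) {
  if (typeof doc !== "object" || doc === null) {
    return ["document must be an object"];
  }
  const errors = [];
  if (typeof doc._id !== "string" || doc._id.length === 0) {
    errors.push("_id must be a non-empty string");
  }
  const name = doc.name;
  if (!name || typeof name.first !== "string" || typeof name.last !== "string") {
    errors.push("name.first and name.last must be strings");
  }
  if (typeof doc.phone !== "string") {
    errors.push("phone must be a string");
  }
  if (!Array.isArray(doc.interests) ||
      !doc.interests.every((x) => typeof x === "string")) {
    errors.push("interests must be an array of strings");
  }
  return errors;
}

// JSON.parse accepts this text because it has no trailing commas.
const employee = JSON.parse(`{
  "_id": "emp_001",
  "name": {"first": "Dante", "last": "Hicks"},
  "phone": "310-555-1212",
  "interests": ["philosophy", "Star Wars"]
}`);

console.log(validateEmployee(employee));           // empty list: nothing to report
console.log(validateEmployee({ _id: "emp_002" })); // lists the three missing fields
```

Returning a list of messages instead of throwing lets the caller report every problem at once.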

Page 5: Introduction to CouchDB - LA Hacker News

RESTful URLs
● All resources in database identified by URL
● Basic URL structure:
  ○ /db_id: A database
  ○ /db_id/doc_id: A document in a database
● URL components and JSON fields starting with an underscore are special, e.g.:
  ○ /_config: CouchDB configuration parameters
  ○ /db_id/_all_docs: A cursor across all documents
  ○ /db_id/_design/design_doc: A design document for a database
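The URL structure above can be captured in a few small helpers. This is a minimal sketch; the function names are hypothetical, and percent-encoding each component with `encodeURIComponent` is an assumption about how a client would handle ids containing characters such as `/`.

```javascript
// Hypothetical path builders that follow the URL layout on the slide.
function dbPath(dbId) {
  return "/" + encodeURIComponent(dbId);
}

function docPath(dbId, docId) {
  return dbPath(dbId) + "/" + encodeURIComponent(docId);
}

function allDocsPath(dbId) {
  // "_all_docs" is a special component, so it is appended literally.
  return dbPath(dbId) + "/_all_docs";
}

function designDocPath(dbId, designDocName) {
  // The "_design/" prefix stays literal; only the name is encoded.
  return dbPath(dbId) + "/_design/" + encodeURIComponent(designDocName);
}

console.log(docPath("my_db", "emp_001"));     // /my_db/emp_001
console.log(allDocsPath("my_db"));            // /my_db/_all_docs
console.log(designDocPath("my_db", "my_dd")); // /my_db/_design/my_dd
```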

Page 6: Introduction to CouchDB - LA Hacker News

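RESTful Methods
● HTTP methods for CRUD actions
  ○ GET: Read data, typically a document
  ○ HEAD: Like GET without a body, typically used to check if a document exists
  ○ PUT: Creates new databases, documents, and other resources; also updates an existing document when its current _rev is supplied
  ○ POST: Creates a document with a server-generated id; also used for bulk operations such as _bulk_docs
  ○ DELETE: Deletes these resources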

Page 7: Introduction to CouchDB - LA Hacker News

RESTful Status Codes
● HTTP status codes for server responses
  ○ 200 OK: Request completed successfully (e.g. retrieving, updating, deleting documents)
  ○ 201 Created: Resource created (used with PUT)
  ○ 202 Accepted: Request accepted and operation pending (e.g. for background operations)
  ○ 401 Unauthorized: Bad username or password
  ○ 404 Not Found: Resource missing
  ○ 409 Conflict: MVCC failure, or concurrent modification to a document
  ○ 500 Internal Server Error: Everybody panic
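A client typically branches on these status codes. The sketch below maps the codes listed above to coarse outcomes; `classifyStatus`, `shouldRefetchAndRetry`, and the outcome labels are hypothetical names chosen for illustration.

```javascript
// Hypothetical mapping from the status codes on the slide to client outcomes.
function classifyStatus(status) {
  switch (status) {
    case 200: return "ok";
    case 201: return "created";
    case 202: return "pending";
    case 401: return "unauthorized";
    case 404: return "missing";
    case 409: return "conflict";
    default:
      // Any other code is treated as a server error if it is 5xx.
      return status >= 500 ? "server_error" : "unhandled";
  }
}

// A conflict means the client's _rev is stale: fetch again, then retry.
function shouldRefetchAndRetry(status) {
  return classifyStatus(status) === "conflict";
}

console.log(classifyStatus(404));        // missing
console.log(shouldRefetchAndRetry(409)); // true
```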

Page 8: Introduction to CouchDB - LA Hacker News

Creating a Database
● Request (abbr.):

    PUT /new_db/ HTTP/1.1

● Response (abbr.):

    HTTP/1.1 201 Created

    {"ok": true}

Page 9: Introduction to CouchDB - LA Hacker News

Creating a Database
● Request (abbr.):

    GET /_all_dbs HTTP/1.1

● Response (abbr.):

    HTTP/1.1 200 OK

    ["new_db"]

Page 10: Introduction to CouchDB - LA Hacker News

Creating a Document
● Request (abbr.):

    PUT /my_db/emp_001 HTTP/1.1

    {
      "name": {
        "first": "Dante",
        "last": "Hicks"
      },
      "phone": "310-555-1212",
      "interests": ["philosophy", "Star Wars"]
    }

Page 11: Introduction to CouchDB - LA Hacker News

Creating a Document
● Response (abbr.):

    HTTP/1.1 201 Created

    {
      "ok": true,
      "id": "emp_001",
      "rev": "1-4c6114c65e295552ab1019e2b046b10e"
    }
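The `rev` in this response is what a later update has to send back, so a client usually folds it into its local copy of the document. A minimal sketch, assuming the response body has already been parsed from JSON; `withWriteResult` is a hypothetical helper name.

```javascript
// Hypothetical helper: merge the server's write response into the local
// document so that the next update can send the current _rev.
function withWriteResult(doc, responseBody) {
  if (!responseBody || responseBody.ok !== true) {
    throw new Error("write was not acknowledged");
  }
  // Copy instead of mutating, so the caller's original object is untouched.
  return Object.assign({}, doc, {
    _id: responseBody.id,
    _rev: responseBody.rev,
  });
}

const localDoc = {
  name: { first: "Dante", last: "Hicks" },
  phone: "310-555-1212",
  interests: ["philosophy", "Star Wars"],
};

// Response body taken from the slide.
const created = withWriteResult(localDoc, {
  ok: true,
  id: "emp_001",
  rev: "1-4c6114c65e295552ab1019e2b046b10e",
});

console.log(created._id);  // emp_001
console.log(created._rev); // 1-4c6114c65e295552ab1019e2b046b10e
```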

Page 12: Introduction to CouchDB - LA Hacker News

Retrieving a Document
● Request (abbr.):

    GET /my_db/emp_001 HTTP/1.1

Page 13: Introduction to CouchDB - LA Hacker News

Retrieving a Document
● Response (abbr.):

    HTTP/1.1 200 OK

    {
      "_id": "emp_001",
      "_rev": "1-4c6114c65e295552ab1019e2b046b10e",
      "name": {
        "first": "Dante",
        "last": "Hicks"
      },
      "phone": "310-555-1212",
      "interests": ["philosophy", "Star Wars"]
    }

Page 14: Introduction to CouchDB - LA Hacker News

Retrieving a Document
● Request (abbr.):

    GET /my_db/missing_emp_007 HTTP/1.1

● Response (abbr.):

    HTTP/1.1 404 Object Not Found

    {
      "error": "not_found",
      "reason": "missing"
    }

Page 15: Introduction to CouchDB - LA Hacker News

Views
● Secondary indexes for querying by fields other than _id
● Written in JavaScript, executed with Mozilla SpiderMonkey engine
● Defined in the design document

Page 16: Introduction to CouchDB - LA Hacker News

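Views
● Define view emps_by_interest:

    function(emp_doc) {
      // Skip documents that have no interests array.
      if (!emp_doc.interests) return;
      for (var i = 0; i < emp_doc.interests.length; ++i) {
        var interest = emp_doc.interests[i];
        // One index row per interest, keyed by its lowercase form.
        emit(interest.toLowerCase(), null);
      }
    }

● Request (abbr.):

    GET /my_db/_design/my_dd/_view/emps_by_interest?key="philosophy" HTTP/1.1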
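To see what the view's index contains, the map function can be run outside CouchDB against a couple of sample documents. This is only a sketch of the idea: `buildViewRows` and `queryByKey` are hypothetical helpers, `emit` is passed in as a parameter here (inside CouchDB it is provided by the view server), the second employee is invented sample data, and plain string comparison stands in for CouchDB's own key collation.

```javascript
// Run a map function over documents and collect the emitted index rows.
function buildViewRows(docs, mapFn) {
  const rows = [];
  for (const doc of docs) {
    // Each emitted row remembers the id of the document that produced it.
    const emit = (key, value) => { rows.push({ id: doc._id, key, value }); };
    mapFn(doc, emit);
  }
  // Keep rows ordered by key (simplified: plain string comparison).
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return rows;
}

// The slide's map function, adapted to receive emit explicitly.
function empsByInterest(empDoc, emit) {
  for (const interest of empDoc.interests || []) {
    emit(interest.toLowerCase(), null);
  }
}

// Equivalent of querying the view with a single key.
function queryByKey(rows, key) {
  return rows.filter((row) => row.key === key);
}

const sampleDocs = [
  { _id: "emp_001", interests: ["philosophy", "Star Wars"] },
  { _id: "emp_002", interests: ["Hockey"] }, // invented sample document
];

const viewRows = buildViewRows(sampleDocs, empsByInterest);
console.log(viewRows.map((row) => row.key)); // hockey, philosophy, star wars
console.log(queryByKey(viewRows, "philosophy")); // one row, from emp_001
```

Because keys are lowercased at index time, a query has to use the lowercase form as well.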

Page 17: Introduction to CouchDB - LA Hacker News

Views
● Response (abbr.):

    HTTP/1.1 200 OK

    {
      "total_rows": 1,
      "offset": 0,
      "rows": [
        {"id": "emp_001", "key": "philosophy", "value": null}
      ]
    }

● Append &include_docs=true in request to return documents with results
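Building the view URL by hand is error-prone because CouchDB expects the `key` parameter to be JSON, so a string key carries its double quotes and must then be percent-encoded. The helper below is a hypothetical sketch of that; the function name and the option names `key` and `includeDocs` are assumptions, while `include_docs=true` is the parameter named on the slide.

```javascript
// Hypothetical builder for a view query path.
function viewPath(dbId, designDoc, viewName, options = {}) {
  const base = "/" + encodeURIComponent(dbId) +
    "/_design/" + encodeURIComponent(designDoc) +
    "/_view/" + encodeURIComponent(viewName);
  const params = [];
  if (options.key !== undefined) {
    // The key is sent as JSON, so a string keeps its quotes before encoding.
    params.push("key=" + encodeURIComponent(JSON.stringify(options.key)));
  }
  if (options.includeDocs) {
    params.push("include_docs=true");
  }
  return params.length > 0 ? base + "?" + params.join("&") : base;
}

console.log(viewPath("my_db", "my_dd", "emps_by_interest", {
  key: "philosophy",
  includeDocs: true,
}));
// /my_db/_design/my_dd/_view/emps_by_interest?key=%22philosophy%22&include_docs=true
```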

Page 18: Introduction to CouchDB - LA Hacker News

Multiversion Concurrency Control (MVCC)
● Every document has a _rev attribute
  ○ Only required field other than _id
● Ensures that client is updating latest data
  ○ No accidental clobbering
● Lock-free concurrency control
● If multiple clients attempt to write concurrently, exactly one succeeds every time
  ○ Always making "forward progress"

Page 19: Introduction to CouchDB - LA Hacker News

Multiversion Concurrency Control (MVCC)

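[Figure: timeline (t) of two clients, CLIENT 1 and CLIENT 2, updating the same document]
● Both clients GET id=doc_id and receive _rev=X, v=1
● Both attempt to write v=2 against _rev=X
  ○ One write succeeds: HTTP 200, new _rev=Y
  ○ The other write fails: HTTP 409
● The client that received the 409 GETs id=doc_id again and receives _rev=Y, v=2
● It then writes v=3 against _rev=Y and succeeds: HTTP 200, new _rev=Z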
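The conflict-and-retry sequence can be reproduced with a small in-memory stand-in for a database. This is a sketch of the revision check only, not CouchDB's implementation: `TinyStore` is hypothetical, its revision tokens are a generation number plus a fixed suffix rather than real CouchDB revision strings, and which client wins the first write is chosen arbitrarily.

```javascript
// In-memory stand-in that accepts a write only if the caller's _rev
// matches the stored revision.
class TinyStore {
  constructor() {
    this.docs = new Map();
  }

  get(id) {
    const stored = this.docs.get(id);
    return stored ? { ...stored } : null; // hand out a copy
  }

  put(doc) {
    const stored = this.docs.get(doc._id);
    const storedRev = stored ? stored._rev : undefined;
    if (doc._rev !== storedRev) {
      return { ok: false, status: 409 }; // stale or missing _rev
    }
    const generation = stored ? Number(stored._rev.split("-")[0]) + 1 : 1;
    const rev = generation + "-sim"; // simplified token, not a real hash
    this.docs.set(doc._id, { ...doc, _rev: rev });
    return { ok: true, rev };
  }
}

const store = new TinyStore();
store.put({ _id: "doc_id", v: 1 });

// Both clients read the same revision.
const readByA = store.get("doc_id");
const readByB = store.get("doc_id");

// Both try to write v=2; only the first write matches the stored _rev.
const writeA = store.put({ ...readByA, v: 2 });
const writeB = store.put({ ...readByB, v: 2 });

// The losing client fetches the latest revision and retries with v=3.
const rereadByB = store.get("doc_id");
const retryB = store.put({ ...rereadByB, v: 3 });

console.log(writeA.ok, writeB.status, retryB.ok); // true 409 true
```

No lock is held between the read and the write; the stale writer simply learns from the 409 that it has to read again.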

Page 20: Introduction to CouchDB - LA Hacker News

Benchmarking

https://github.com/mgp/iron-cushion

● Setup:
  ○ Server: Intel Core 2 2.83GHz quad-core, 4GB RAM
  ○ Client: 1.83 GHz Intel Core Duo MacBook
  ○ 100Mbit LAN, 100 concurrent connections
  ○ First, bulk insert 2,000,000 documents
  ○ Second, intersperse 20,000 create and read operations, 30,000 update and delete operations
● Caveat: no indexes

Page 21: Introduction to CouchDB - LA Hacker News

Benchmarking

    bulkInsertRate:       10,003.030 docs/sec
    createProcessingRate:    949.141 docs/sec
    readProcessingRate:    9,015.862 docs/sec
    updateProcessingRate:    980.172 docs/sec
    deleteProcessingRate:    980.154 docs/sec
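A little arithmetic puts these rates in perspective. The sketch below uses only the figures from the slides; the derived bulk-insert duration assumes the reported rate held constant across all 2,000,000 documents, which the slides do not state.

```javascript
// Rates reported on the slide, in documents per second.
const rates = {
  bulkInsert: 10003.030,
  create: 949.141,
  read: 9015.862,
  update: 980.172,
  delete: 980.154,
};

// Estimated time for the bulk insert phase, assuming a constant rate.
const bulkInsertSeconds = 2000000 / rates.bulkInsert;

// How much faster reads were processed than single-document creates.
const readToCreateRatio = rates.read / rates.create;

console.log(bulkInsertSeconds.toFixed(1)); // roughly 200 seconds
console.log(readToCreateRatio.toFixed(1)); // roughly 9.5
```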

Page 22: Introduction to CouchDB - LA Hacker News

Thanks!

http://couchdb.apache.org/

[email protected]

github.com/mgp

http://mgp.github.com/couchdb-la-hn.pdf