Cassandra summit

Post on 30-Jun-2015

Transcript of Cassandra summit

Friday, August 10, 12

Flexible schema

Easy to scale, increased redundancy

Fast enough for web requests

Consolidate existing services

Hadoop support

FUD

No ad-hoc queries

No indexes

No range queries

Limited tooling

Code complexity

Thrift

CQL

REST

SOLR Schema

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="my_column_family" version="1.0">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="date" class="solr.DateField"/>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="name" type="string" indexed="true" stored="true"/>
    <field name="released_at" type="date" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>name</defaultSearchField>
</schema>

Basic Queries

http://localhost:8983/solr/my_keyspace.my_column_family/select?q=name:foo

SELECT * FROM my_column_family WHERE solr_query='name:foo';
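The two forms above express the same search, once through the Solr HTTP endpoint and once through the solr_query column in CQL. As a rough sketch, both strings could be built as follows. The helper names are hypothetical, the host, port, keyspace and column family are simply the values shown on the slide, and no escaping of Solr query syntax is attempted.

```python
from urllib.parse import urlencode


def solr_select_url(keyspace, column_family, field, value,
                    host="localhost", port=8983):
    """Build the HTTP select URL for a single field:value query."""
    # safe=":" keeps the field separator readable, as on the slide.
    query = urlencode({"q": f"{field}:{value}"}, safe=":")
    return f"http://{host}:{port}/solr/{keyspace}.{column_family}/select?{query}"


def cql_solr_query(column_family, field, value):
    """Build the equivalent CQL statement using the solr_query column."""
    # Double any single quotes so the CQL string literal stays valid.
    solr_query = f"{field}:{value}".replace("'", "''")
    return f"SELECT * FROM {column_family} WHERE solr_query='{solr_query}';"


print(solr_select_url("my_keyspace", "my_column_family", "name", "foo"))
print(cql_solr_query("my_column_family", "name", "foo"))
```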

Wide Rows

<?xml version="1.0" encoding="UTF-8" ?>
<schema name="my_column_family" version="1.0">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="date" class="solr.DateField"/>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="name" type="string" indexed="true" stored="true"/>
    <field name="released_at" type="date" indexed="true" stored="true"/>
    <dynamicField name="wide_*" type="string" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>name</defaultSearchField>
</schema>
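The dynamicField entry is what lets a wide row be indexed: any column whose name starts with wide_ is picked up as a string field without being declared individually. A minimal sketch of that name resolution follows, assuming explicitly declared fields are checked before dynamic patterns. The function name and the sample column names are made up for illustration, and the glob match is only an approximation of how Solr resolves names.

```python
from fnmatch import fnmatchcase

# Field names and types copied from the schema above.
STATIC_FIELDS = {"id": "string", "name": "string", "released_at": "date"}
DYNAMIC_FIELDS = [("wide_*", "string")]


def resolve_field_type(column_name):
    """Return the schema type for a column, or None if nothing matches."""
    # Assumption: explicitly declared fields win over dynamic patterns.
    if column_name in STATIC_FIELDS:
        return STATIC_FIELDS[column_name]
    for pattern, field_type in DYNAMIC_FIELDS:
        if fnmatchcase(column_name, pattern):
            return field_type
    return None


# Hypothetical column names, as they might appear in one wide row.
for column in ["released_at", "wide_2012_08_10", "other"]:
    print(column, "->", resolve_field_type(column))
```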

Fuzzy Search

<schema name="my_column_family" version="1.0">
  <types>
    <fieldType name="string" class="solr.StrField"/>
    <fieldType name="ngram" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.KeywordTokenizerFactory"/>
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1"
                catenateWords="1" catenateNumbers="1"
                catenateAll="1" preserveOriginal="1"/>
        <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="15"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      </analyzer>
    </fieldType>
  </types>
  <fields>
    <field name="id" type="string" indexed="true" stored="true" />
    <field name="name" type="string" indexed="true" stored="true" />
    <field name="name_fuzzy" type="ngram" indexed="true" stored="true" />
  </fields>
  <copyField source="name" dest="name_fuzzy"/>
  <uniqueKey>id</uniqueKey>
  <defaultSearchField>name</defaultSearchField>
</schema>
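To see why the ngram type gives substring-style matching, it helps to look at what the index-time analyzer stores. The sketch below is a simplified stand-in, not Solr's implementation: it treats the whole name as one token (as the keyword tokenizer does), skips the word-delimiter filter entirely, and emits every substring between the configured minimum and maximum gram sizes. The function names are hypothetical.

```python
def ngrams(token, min_size=2, max_size=15):
    """Return every substring of token whose length is in [min_size, max_size]."""
    grams = set()
    for size in range(min_size, min(max_size, len(token)) + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams


def fuzzy_match(indexed_name, term):
    """True if a single query term equals one of the indexed n-grams."""
    # The query analyzer in the schema only splits on whitespace, so a term
    # is compared as-is against the grams produced at index time.
    return term in ngrams(indexed_name)


print(sorted(ngrams("abc")))
print(fuzzy_match("cassandra", "ssan"))
```

Because the schema shows no lowercase filter, matching in this sketch is case-sensitive, and terms shorter than minGramSize never match.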

• Full-text indexing

• Trigrams

• Rich data formats (PDF, Word, HTML)

• Easy interop (REST, CSV, XML, JSON)

• Geo-spatial search

• Highlighting

• Auto-suggest

• Faceted search and filtering

Storm

Storm

Increased performance by 700% while growing data by 500%

Reduced operational costs by 40%

Deleted 15,000 lines of code
