Eugene Kurpilyansky, "Indexing on top of Cassandra" (talk at Cassandra conf 2013)
![Page 1: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/1.jpg)
Indexing Cassandra data in SQL-storage
Kurpilyansky Eugene
SKB Kontur
December 9th, 2013
![Page 2: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/2.jpg)
What do we want?
Suppose we want to store objects of different types in Cassandra.
Every object has a primary string key.
Cassandra is well suited to serve as a key-value storage.
But we usually also want to search among all objects of the same type by some criterion.
Search results must be consistent and reflect the current state of the database.
How can we implement a storage that satisfies these requirements?
![Page 7: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/7.jpg)
Using native Cassandra indexes
We can use native Cassandra indexes.
Advantages
There is no need to maintain an additional storage.
Disadvantages
Every custom query may require a new column family (CF) structure to be searched effectively.
SQL indexes are more efficient than Cassandra's indexes.
Many complex kinds of indexes exist (e.g. full-text search indexes) that native Cassandra indexes do not cover.
![Page 12: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/12.jpg)
Using synchronization with SQL-storage
Main idea
Run an IndexService application that continuously synchronizes the data in the SQL-storage with the data in Cassandra, in a background thread.
To perform a search, we send a query to IndexService, which returns the search result once the SQL-storage synchronization has finished.
![Page 14: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/14.jpg)
Implementation of EventLog
Create an event log.
There is one event per write request or delete request.
The event log is sorted by event time.
![Page 16: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/16.jpg)
Synchronizing SQL-storage with Cassandra
Implementation of EventLog
Event
    string EventId;
    long Timestamp;
    string ObjectId;
interface IEventLog
    void AddEvent(Event event);
    IEnumerable<Event> GetEvents(long fromTicks);
New implementation of IObjectStorage
Before writing or deleting an object, call the method IEventLog.AddEvent.
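The rule above (log the event first, then touch the object) can be sketched as follows. This is a minimal in-memory Python sketch, not the talk's implementation: `InMemoryEventLog`, `LoggingObjectStorage`, the `now_ticks` parameter and the choice to stamp the object with the event's timestamp are all assumptions made for illustration. A real version would write to Cassandra.

```python
import time
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: str
    timestamp: int   # "ticks"; the unit is an assumption of this sketch
    object_id: str


class InMemoryEventLog:
    """Stand-in for IEventLog; a real one would store events in Cassandra."""

    def __init__(self):
        self._events = []

    def add_event(self, event):
        self._events.append(event)

    def get_events(self, from_ticks):
        # Events stamped at or after from_ticks, oldest first.
        newer = [e for e in self._events if e.timestamp >= from_ticks]
        return sorted(newer, key=lambda e: (e.timestamp, e.event_id))


class LoggingObjectStorage:
    """Hypothetical IObjectStorage: every write or delete is logged first."""

    def __init__(self, event_log, now_ticks=time.time_ns):
        self._event_log = event_log
        self._now_ticks = now_ticks
        self._objects = {}   # stand-in for the Cassandra column family

    def _log(self, object_id):
        ts = self._now_ticks()
        self._event_log.add_event(Event(uuid.uuid4().hex, ts, object_id))
        return ts

    def write(self, object_id, obj):
        ts = self._log(object_id)              # the event goes first
        self._objects[object_id] = (ts, obj)   # then the object itself

    def delete(self, object_id):
        self._log(object_id)
        self._objects.pop(object_id, None)

    def read(self, object_id):
        return self._objects.get(object_id)
```

Because the event is written before the object, a reader of the log may see an event whose object has not changed yet, but it never misses a change that has no event.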
![Page 19: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/19.jpg)
Implementation of EventLog
EventLog.AddEvent(Event event)
Create a column:
    ColumnName = event.Timestamp + ':' + event.EventId
    ColumnValue = event
EventLog.GetEvents(long fromTicks)
Execute get_slice on one row, starting from an exclusive start column.
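A small Python sketch of this column layout, with one sorted row simulated in memory. It rests on assumptions the slide does not state: the timestamp is zero-padded to a fixed width (`TICKS_WIDTH`, hypothetical) so that string order matches numeric order, and events stamped exactly `fromTicks` are returned. `EventRow` only imitates a sorted wide row; it is not a Cassandra client.

```python
import bisect

TICKS_WIDTH = 20  # assumed fixed width, so that string order equals numeric order


def column_name(timestamp, event_id):
    return f"{timestamp:0{TICKS_WIDTH}d}:{event_id}"


class EventRow:
    """One wide row whose column names are kept sorted."""

    def __init__(self):
        self._names = []
        self._values = {}

    def insert(self, name, value):
        if name not in self._values:
            bisect.insort(self._names, name)
        self._values[name] = value

    def slice_after(self, start_name):
        # Exclusive start: skip every column whose name is <= start_name.
        i = bisect.bisect_right(self._names, start_name)
        return [self._values[n] for n in self._names[i:]]


def get_events(row, from_ticks):
    # "<from_ticks>:" sorts before every real column stamped from_ticks,
    # so an exclusive slice from it still returns those columns.
    return row.slice_after(column_name(from_ticks, ""))
```

Without the padding, plain concatenation would order `"9:b"` after `"10:c"`, and a time-ordered slice would return the wrong columns.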
![Page 21: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/21.jpg)
Implementation of EventLog
To limit the size of rows, we should split the event log into rows using PartitionInterval.
PartitionInterval is a constant period of time (e.g. one hour, or six minutes).
EventLog.AddEvent(Event event)
Create a column:
    RowKey = event.Timestamp / PartitionInterval.Ticks
    ColumnName = event.Timestamp + ':' + event.EventId
    ColumnValue = event
EventLog.GetEvents(long fromTicks)
Execute get_slice on one or more rows, starting from an exclusive start column.
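As a worked example of the row-key arithmetic, the Python sketch below computes the row for a timestamp and lists the rows a reader has to visit. The names `row_key` and `row_keys_to_read` are hypothetical, and the one-hour constant assumes .NET-style ticks of 100 ns, which the slides do not state explicitly.

```python
PARTITION_TICKS = 36_000_000_000  # one hour, assuming 100 ns ticks


def row_key(timestamp, partition_ticks=PARTITION_TICKS):
    # Integer division: all events of one interval share a row.
    return timestamp // partition_ticks


def row_keys_to_read(from_ticks, now_ticks, partition_ticks=PARTITION_TICKS):
    """Rows that may hold events with from_ticks <= Timestamp <= now_ticks."""
    first = row_key(from_ticks, partition_ticks)
    last = row_key(now_ticks, partition_ticks)
    return list(range(first, last + 1))
```

A reader that starts a few ticks before an interval boundary therefore has to slice at least two rows, which is why GetEvents may touch more than one row.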
![Page 22: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/22.jpg)
Implementation of IndexService
IndexService
It has a local SQL-storage (one storage per service replica).
There is one SQL table per object type.
There is one dedicated SQL table that stores the time of the last synchronization for each object type.
There is one background thread per object type, which reads the event log and updates the SQL-storage.
To execute an incoming SQL query, we can use the data from the SQL-storage plus a small range of events.
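The table layout described above might look like the sketch below, written against SQLite through Python's standard `sqlite3` module. The talk does not name the SQL engine, so SQLite, the `sync_state` table name and the columns (`object_id`, `timestamp`, `school`, the last one borrowed from the talk's examples) are assumptions for illustration only.

```python
import sqlite3


def open_index(object_types, path=":memory:"):
    """Create one table per object type plus a table of last sync times."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS sync_state ("
        " object_type TEXT PRIMARY KEY,"
        " last_sync_ticks INTEGER NOT NULL)"
    )
    for name in object_types:
        # Table names cannot be bound as parameters, so validate them.
        if not name.isidentifier():
            raise ValueError(f"unsafe table name: {name!r}")
        db.execute(
            f'CREATE TABLE IF NOT EXISTS "{name}" ('
            " object_id TEXT PRIMARY KEY,"
            " timestamp INTEGER NOT NULL,"
            " school TEXT)"
        )
        # Start every object type at "never synchronized".
        db.execute("INSERT OR IGNORE INTO sync_state VALUES (?, 0)", (name,))
    db.commit()
    return db
```

With the objects in an ordinary table, a search by criterion becomes a plain `SELECT ... WHERE school = ?`, and any SQL index can be added without changing the Cassandra schema.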
![Page 27: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/27.jpg)
Implementation of IndexService
Periodic synchronization action
Set startSynchronizationTime = NowTicks.
Find all events that should be processed.
Process these events: update the SQL-storage and keep the unprocessed events (they should be processed on the next iteration).
Set the time of last synchronization in the SQL-storage to startSynchronizationTime.
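One iteration of this action can be sketched in Python as below. The function and its parameters (`state`, `now_ticks`, `find_events`, `process_events`) are hypothetical placeholders; the sketch keeps the state in a dict, whereas the talk stores the last synchronization time in the SQL-storage.

```python
def synchronize_once(state, now_ticks, find_events, process_events):
    """Run one iteration; `state` holds 'last_sync' and 'unprocessed'."""
    # Take the time before reading anything.
    start = now_ticks()
    # Collect the events that should be processed in this iteration.
    events = find_events(state)
    # Update the SQL-storage; whatever could not be applied is kept.
    state["unprocessed"] = process_events(events)
    # Record the start time, not the finish time, as the last sync time.
    state["last_sync"] = start
    return state
```

Recording the start time matters: events written while the iteration was running carry timestamps after `start`, so the next iteration still looks for them.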
![Page 31: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/31.jpg)
Implementation of IndexService
ProcessEvents(Event[] events)
This function brings the related objects in the SQL-storage up to date.
Remember that we update an object after creating its event.
So some events cannot be processed right away, because the corresponding object has not been updated yet.
![Page 33: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/33.jpg)
Implementation of IndexService
Event[] ProcessEvents(Event[] events)
This function brings the related objects in the SQL-storage up to date and returns the events that have not been processed.
How will this function be implemented?
For every event, we should examine the corresponding objects from both Cassandra and the SQL-storage.
![Page 40: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/40.jpg)
Implementation of IndexService
Example 1
event = {Timestamp: 2008}
cassObj = {Timestamp: 2008, School: 'USU'}
sqlObj = {Timestamp: 2005, School: 'AESC USU'}
Write cassObj to the SQL-storage and mark the event as processed.
Example 2
event = {Timestamp: 2012}
cassObj = {Timestamp: 2008, School: 'USU'}
sqlObj = {Timestamp: 2005, School: 'AESC USU'}
The timestamp of the event is greater than the timestamp of cassObj.
The object has probably not been updated yet, so we need to wait for it.
Write cassObj to the SQL-storage and mark the event as unprocessed.
![Page 43: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/43.jpg)
Implementation of IndexService
Example 3
event = {Timestamp: 1997}
cassObj is missing
sqlObj is missing
This event probably corresponds to the creation of the object.
Mark the event as unprocessed.
![Page 46: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/46.jpg)
Implementation of IndexService
Example 4
event = {Timestamp: 2017}
cassObj is missing
sqlObj = {Timestamp: 2012, School: 'UFU'}
Two cases are possible:
1 The event corresponds to the deletion of the object.
2 The event corresponds to the creation of the object. sqlObj is not missing because there were two operations in a row: delete and create.
Delete sqlObj from the SQL-storage and mark the event as unprocessed.
![Page 47: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/47.jpg)
Implementation of IndexService
Event[] ProcessEvents(Event[] events)
Read the objects that occur in these events from Cassandra and the SQL-storage (some of them may be missing).
For each (event, cassObj, sqlObj) do
    If cassObj is not missing
        Save cassObj in the SQL-storage.
        If event.Timestamp <= cassObj.Timestamp
            then mark the event as processed;
            else mark the event as unprocessed.
    else (i.e. cassObj is missing)
        Delete sqlObj from the SQL-storage if it is not missing.
        Mark the event as unprocessed.
Return the events that have been marked as unprocessed.
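The procedure above translates almost line by line into the Python sketch below. Both stores are replaced by plain dicts keyed by object id, which is an assumption of the sketch; real code would read from Cassandra and write to the SQL-storage.

```python
from collections import namedtuple

Event = namedtuple("Event", "event_id timestamp object_id")


def process_events(events, cassandra, sql):
    """Apply the rules above; return the events that must be retried.

    `cassandra` and `sql` map object_id -> {"Timestamp": ..., ...}
    and stand in for the two stores.
    """
    unprocessed = []
    for event in events:
        cass_obj = cassandra.get(event.object_id)
        if cass_obj is not None:
            # Save cassObj in the SQL-storage in either case.
            sql[event.object_id] = dict(cass_obj)
            if event.timestamp > cass_obj["Timestamp"]:
                # The write announced by this event is not visible yet.
                unprocessed.append(event)
        else:
            # Delete sqlObj if it exists, then retry the event later.
            sql.pop(event.object_id, None)
            unprocessed.append(event)
    return unprocessed
```

Run against the four examples, only the event of Example 1 is consumed. The events of Examples 2, 3 and 4 are returned for the next iteration, and in Example 4 the stale sqlObj is removed.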
![Page 49: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/49.jpg)
Implementation of IndexService
Which events should we pass to the ProcessEvents function?
All unprocessed events from the previous iteration, of course.
Also all new events, i.e. IEventLog.GetEvents(fromTicks).
What is fromTicks?
fromTicks = lastSynchronizationTime?
No. Unfortunately, any operation on Cassandra can take a long time.
This time is bounded by
writeTimeout = attemptsCount · connectionTimeout.
We should step back by this amount, otherwise we can lose some events.
fromTicks = lastSynchronizationTime - writeTimeout
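The selection of events can be sketched in Python as below. The function names are hypothetical, and the deduplication by `event_id` is an addition of this sketch, not something the slides mention. It is there because stepping back by writeTimeout makes consecutive iterations read overlapping ranges of the log.

```python
from collections import namedtuple

Event = namedtuple("Event", "event_id timestamp object_id")


def write_timeout(attempts_count, connection_timeout):
    # Upper bound on how long a single write to Cassandra may take.
    return attempts_count * connection_timeout


def events_to_process(unprocessed, get_events, last_sync_time, timeout):
    """Leftovers of the previous iteration plus everything since from_ticks."""
    from_ticks = last_sync_time - timeout
    merged = {}
    for event in list(unprocessed) + list(get_events(from_ticks)):
        # Overlapping reads return some events twice; keep one copy.
        merged.setdefault(event.event_id, event)
    return sorted(merged.values(), key=lambda e: (e.timestamp, e.event_id))
```

For example, with attemptsCount = 3 and connectionTimeout = 5 the window is 15 ticks. An event stamped 90 whose write became visible only after a synchronization at time 100 is still picked up, because reading starts at 85.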
![Page 54: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/54.jpg)
Indexing Cassandra data in SQL-storage
Synchronizing SQL-storage with Cassandra
Implementation of IndexService
What events should we use as arguments in ProcessEvents
function?
Of course, all unprocessed events from previous iteration.
Also all new events, i.e. IEventLog.GetEvents(fromTicks).
What is fromTicks?
fromTicks = lastSynchronizationTime?
No. Unfortunately, any operation with Cassandra can be
executed for a long time.
This time is limited by
writeTimeout = attemptsCount · connectionTimeout.
We should make undertow back, otherwise we can lose some
events.
fromTicks = lastSynchronizationTime - writeTimeout
![Page 55: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/55.jpg)
Indexing Cassandra data in SQL-storage
Synchronizing SQL-storage with Cassandra
Implementation of IndexService
What events should we use as arguments in ProcessEvents
function?
Of course, all unprocessed events from previous iteration.
Also all new events, i.e. IEventLog.GetEvents(fromTicks).
What is fromTicks?
fromTicks = lastSynchronizationTime?
No. Unfortunately, any operation with Cassandra can be
executed for a long time.
This time is limited by
writeTimeout = attemptsCount · connectionTimeout.
We should make undertow back, otherwise we can lose some
events.
fromTicks = lastSynchronizationTime - writeTimeout
![Page 56: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/56.jpg)
Synchronizing SQL-storage with Cassandra
Implementation of IndexService
Executing search request
![Page 57: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/57.jpg)
Advantages.
Scalability.
Availability.
Fault tolerance.
Sharding.
![Page 59: Евгений Курпилянский "Индексирование поверх Cassandra". Выступление на Cassandra conf 2013](https://reader031.fdocuments.in/reader031/viewer/2022020110/5562191ed8b42af2128b5549/html5/thumbnails/59.jpg)
Questions
Thank you for your attention. Any questions?