Prophet: a path out of the Cloud
Uploaded by jesse-vincent
You may know me from...
RT (Request Tracker)
Jifty
SVK
Hiveminder
Perl 6
Shirts
2
I’ve been hacking on an open source database
called “Prophet”
3
It has an API like Amazon SimpleDB or Google App Engine’s...
4
It’s designed for “team-scale” apps
5
It’s built for P2P replication and
disconnected use
6
App #1 is the canonical “offline bug tracker”
7
App #2 will probably be a BBS you can sync
over sneakernet
8
But first, a brief digression...
9
...about cloud computing
10
Living in the cloud =
sharecropping
11
(That’s bad)
12
This is a rant
13
The bad old days:
14
Pic of sharecroppers
15
You farmed land you didn’t own...
16
...with tools you couldn’t really afford
17
You paid for it with part of your harvest...
18
It sounded like a pretty sweet deal...
19
...until things got bad
20
(Things always got bad)
21
In a bad year, you got further in debt to the land owner
22
23
The (more recent) bad old days:
24
pic of mainframes
25
You ran code you didn’t own on hardware you
didn’t own
26
Things got a little better:
27
Pic of PCs
28
Things weren’t all rosy:
29
Pic of BSOD
30
Sometimes new versions of software
killed features...
31
...so you were locked in to old versions
32
Pic of Windows 3.1?
33
Things got ‘better’:
34
rmsche
35
Now, things are getting worse again...
36
37
What happens when your favorite service
goes down?
38
pic of twitter being down
39
...or stops accepting new signups?
40
41
...or gives all your data to the secret police?
42
Pic of yahoo.cn
43
...or starts making arbitrary choices about what’s ‘safe’ content?
44
45
You don’t own the services you use
46
When the service provider cuts you off, that’s it. No recourse.
47
Not so secret shame: I’m a really bad zealot
48
My calendar lives at google.com
49
50
I make a web 2.0 tasklist service called
Hiveminder.com
51
pic of hiveminder
52
Using hosted apps is going to hurt you
53
Data access is important
54
APIs are great
55
...but easy access to a service just makes it
easier to get locked in
56
What about Google Gears, Adobe Air, etc?
57
Great. Now you can use your word processor while you’re offline!
58
Pic of wordperfect
59
Real offline apps shouldn’t need servers
60
Real offline apps should sync like you do
61
I might be a nut job
62
...but smart people seem to agree with me
63
If we want people to have the same degree of user autonomy as we've come to expect from the world, we may have to sit down and code alternatives to Google Docs, Twitter, and EC3 that can live with us on the edge, not be run by third parties.
- Danny O’Brien
http://www.oblomovka.com/entries/2008/07/16
64
Back to that database thing...
65
Jesse Vincent
66
Chia-liang Kao
67
We work together
68
CL lives in Taipei
Jesse lives in Boston
69
Sometimes we need to work face to face
70
TPE - BOS: 9,410 mi
TPE - HNL: 5,095 mi
BOS - HNL: 5,069 mi
71
Our Plan
Step 1: Go to Hawaii for “work”
Step 2: ???
Step 3: Prophet!
72
The Plan Backfired
We were there for 8 days
We wrote 8000 lines of Perl
We figured out step 2
73
Step 2:
Build a Disconnected Syncable Database
74
Prophet
75
Getting Prophet
Prophet
http://syncwith.us/prophet
SD
http://syncwith.us/sd
76
Prophet
A grounded, semirelational,
peer to peer replicated,
disconnected, versioned,
property database with
self-healing conflict resolution
77
What do all thosebuzzwords mean?
78
grounded
Runs here
79
grounded
Not here
80
grounded
Runs at the edge
Doesn’t need to run in the cloud
Syncs with services you already use
(We call the adaptors “Foreign Replicas”)
81
semirelational
Joins are expensive
(They’re still possible)
82
peer-to-peer replicated
Update any replica
Pull from any replica
Push to any replica
Publish a replica
Changes will propagate
83
disconnected
Real-time replication is hard to scale
It only “works” with constant connectivity
I don’t have constant connectivity
Neither do you
Prophet sync can happen whenever
84
versioned
Every update is recorded as a changeset
Changesets don’t lose any data
(so you can use them to go backwards)
All history is introspectable
Replication just replays changesets
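To make "changesets don't lose any data" concrete, here is a minimal sketch in Python (not Prophet's Perl, and not its real on-disk format). It assumes each change stores both the old and the new value of a property, which is what lets a changeset be inverted and replayed to go backwards. The names `PropChange`, `apply_changeset` and `invert` are made up for this illustration.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class PropChange:
    """One property change (hypothetical shape). Keeping `old` makes it reversible."""
    record_id: str
    prop: str
    old: Any  # None means the property did not exist before
    new: Any  # None means the property is removed


def apply_changeset(store, changeset):
    """Apply every change in order to a dict of {record_id: {prop: value}}."""
    for change in changeset:
        record = store.setdefault(change.record_id, {})
        if change.new is None:
            record.pop(change.prop, None)
        else:
            record[change.prop] = change.new


def invert(changeset):
    """Build the changeset that undoes `changeset`: swap old/new, reverse the order."""
    return [PropChange(c.record_id, c.prop, old=c.new, new=c.old)
            for c in reversed(changeset)]


store = {}
create = [PropChange("p1", "name", None, "Jesse"),
          PropChange("p1", "age", None, 31)]
birthday = [PropChange("p1", "age", 31, 32)]

apply_changeset(store, create)
apply_changeset(store, birthday)
after_update = dict(store["p1"])

# Going backwards is just applying the inverted changeset.
apply_changeset(store, invert(birthday))
after_undo = dict(store["p1"])
```

Because nothing is thrown away, the same list of changes serves as history, undo log and replication stream.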
85
property database
Atomic operations
CREATE, READ, UPDATE, DELETE, SEARCH
Record types can have optional validation and canonicalization
Records of the same type do not need to have the same properties
Add and remove properties at will
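A rough sketch of what "optional validation and canonicalization" could look like, again in Python and not taken from Prophet's code. It assumes canonicalizers run before validators and that properties without either are accepted untouched; the `RecordType` class and its methods are hypothetical.

```python
class RecordType:
    """Hypothetical record type with optional per-property hooks."""

    def __init__(self, name):
        self.name = name
        self.canonicalizers = {}  # prop -> function that normalizes a value
        self.validators = {}      # prop -> function returning True if the value is OK

    def create(self, **props):
        """Return a cleaned property dict, or raise ValueError on a bad value."""
        clean = {}
        for prop, value in props.items():
            if prop in self.canonicalizers:
                value = self.canonicalizers[prop](value)
            if prop in self.validators and not self.validators[prop](value):
                raise ValueError(f"invalid {prop!r} for {self.name}: {value!r}")
            clean[prop] = value
        return clean


person = RecordType("Person")
person.canonicalizers["name"] = str.strip
person.validators["age"] = lambda v: isinstance(v, int) and v >= 0

jesse = person.create(name="  Jesse ", age=31)
# Same type, different properties: nothing forces a fixed schema.
pet = person.create(name="Fluffy", species="cat")
```

Both records are of type Person, yet only one has an age and only one has a species.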
86
self-healing conflict resolution
Remembers all conflict resolutions
Syncs all resolutions with your peers
Detects identical conflicts
Uses your peers’ resolutions to “vote” for the winner of a conflict
87
Working with Prophet
88
RESTy API
GET /records.json
GET /records/Cars.json
GET /records/Cars/716499-5F9-4AC4-827.json
GET /records/Cars/716499-5F9-4AC4-827/wheels.json
POST /records/Cars.json
POST /records/Cars/716499-5F9-4AC4-827.json
POST /records/Cars/716499-5F9-4AC4-827/wheels.json
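The routes above nest as record type, then record UUID, then what looks like a single property (reading `wheels` as a property name is an assumption). A small Python helper, hypothetical and not part of Prophet, shows how a client might build those paths:

```python
from urllib.parse import quote


def records_path(record_type=None, uuid=None, prop=None):
    """Build a path in the style shown on the slide.

    Segments are added left to right and stop at the first one that is missing,
    so a property is only included when a type and a UUID are given as well.
    """
    parts = ["records"]
    for part in (record_type, uuid, prop):
        if part is None:
            break
        parts.append(quote(part, safe=""))
    return "/" + "/".join(parts) + ".json"


all_types = records_path()
all_cars = records_path("Cars")
one_car = records_path("Cars", "716499-5F9-4AC4-827")
one_prop = records_path("Cars", "716499-5F9-4AC4-827", "wheels")
```

The same path would be used with GET to read and with POST to write, matching the listing.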
89
RESTy API
Yes, we should be using PUT and DELETE
Yes, you can have a commit bit and help us fix it :)
90
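Native API
(Yes, the core is Perl.)
use strict;
use warnings;
use Prophet::CLI;
use Prophet::Record;
use Prophet::Collection;
# Get a replica handle, create a Person record, then update one property
my $cli = Prophet::CLI->new();
my $cxn = $cli->app_handle->handle;
my $record = Prophet::Record->new( handle => $cxn, type => 'Person' );
my $uuid = $record->create( props => { name => 'Jesse', age => 31 } );
$record->set_prop( name => 'age', value => 32 );
# Find every Person that is not a cat (a missing species counts as "not a cat")
my $people = Prophet::Collection->new( handle => $cxn, type => 'Person' );
$people->matching( sub { ( shift->prop('species') // '' ) ne 'cat' } );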
91
What could you build with Prophet?
92
sd
A bug tracker: “simple defects”
• ID, Status, Summary
• (Arbitrary other properties too)
• History
• Comments
• Attachments
93
Create
./bin/sd ticket create -- summary="Can't sync sd with Google Code" status=new
Created ticket 5 (93BF979E-08C1-11DD-94C3-D4B1FCEE7EC4)
94
List and Search
./bin/sd ticket search --regex publish
29 } new the online help doesn't describe publish
34 } new publish a static html view of records
35 } new publish should create a static rss file
95
Updates
./bin/sd ticket update --uuid 93BF979E-08C1-11DD-94C3-D4B1FCEE7EC4 -- status=resolved
96
Bugs on my laptop aren’t interesting.
97
Sync!
Jesse:
sd publish --to fsck.com:public_html/sd/
CL:
sd pull --from http://my.com/~jesse/sd
98
My project has a bug tracker
Actually, mine uses two:
• RT
• hiveminder.com
99
Foreign Replicas
Prophet makes Foreign Replicas easy
SD gets them "for free"
100
Wrote an RT Replica for SD
(Using only the public REST API)
It took an afternoon
Mirror an RT instance into SD
Share it with your peers using Prophet
Sync changes back from your peers to RT
Supports Comments and Attachments
101
...and one for Hiveminder
(Using only the public REST API)
102
I can sync my bugs with RT or Hiveminder
103
Actually, it’s better
104
I can sync between RT and Hiveminder
105
I can sync between two different RTs, too
106
We need more replica definitions:
• Trac
• Launchpad
• Google Code
• SourceForge
• Bugzilla
• Jira
• GForge
• debbugs
• GNATS
• todo.txt
• Lighthouse
• Redmine
• FogBugz
• What else?
107
What else can you use Prophet for?
108
All your “social” databases
109
All the databases you want while offline.
• CRM
• Bug tracking
• Sales orders
• Phone book
• Blog
• Trading Card Database
• Ideas?
110
“Private” Social Networks
How about a P2P BBS?
Prophet doesn’t need a server.
You can sync over sneakernet.
111
A look inside Prophet
112
Anatomy of a Prophet Replica
113
The bits and pieces
Database UUID
Replica UUID
Record Store
Changeset Store
Resolution Database
Configuration metadata
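One way to picture those pieces is as a single data structure. This Python sketch is an illustration only. It assumes that every replica of one database shares the Database UUID while each copy gets its own Replica UUID, and the field names and the `new_peer` helper are invented here.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory picture of the parts listed on the slide."""
    database_uuid: str                               # shared by all replicas of one database
    replica_uuid: str = field(default_factory=lambda: str(uuid.uuid4()))
    records: dict = field(default_factory=dict)      # record store: current state, by type
    changesets: list = field(default_factory=list)   # changeset store: full history
    resolutions: dict = field(default_factory=dict)  # resolution database
    config: dict = field(default_factory=dict)       # configuration metadata


def new_peer(source):
    """Start an empty replica of the same database (assumed behaviour)."""
    return Replica(database_uuid=source.database_uuid)


mine = Replica(database_uuid=str(uuid.uuid4()))
yours = new_peer(mine)
```

The two UUIDs answer different questions: "is this the same database?" and "which copy of it did this change come from?".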
114
The Record Store
Stores individual records by type
Not guaranteed to have all old versions
115
The Changeset Store
Stores every change to a set of records
Guaranteed to have all old changesets
Replaying all changesets will create an exact clone of the replica
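The split between the two stores can be sketched like this: the record store is whatever you get by replaying the changeset store from the start. The Python below is a toy model under assumed names (`replay`, and changes written as `(operation, record_id, props)` tuples), not Prophet's format.

```python
def replay(changesets):
    """Rebuild a record store from scratch by replaying every changeset in order."""
    records = {}
    for changeset in changesets:
        for op, record_id, props in changeset:
            if op == "create":
                records[record_id] = dict(props)
            elif op == "update":
                records[record_id].update(props)
            elif op == "delete":
                del records[record_id]
            else:
                raise ValueError(f"unknown operation: {op!r}")
    return records


log = [
    [("create", "t1", {"summary": "publish needs docs", "status": "new"})],
    [("create", "t2", {"summary": "scratch ticket"}),
     ("update", "t1", {"status": "open"})],
    [("delete", "t2", {}),
     ("update", "t1", {"status": "resolved"})],
]

original = replay(log)
clone = replay(list(log))      # a peer that received the same changesets
earlier = replay(log[:2])      # old states come from history, not from the record store
```

This is also why the record store does not need to keep old versions: any earlier state can be recovered by replaying a prefix of the history.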
116
Replica Backends
117
Filesystem
Readable
Flat files
Compact
Fast
(Not yet fully atomic)
118
HTTP
Designed to let you “publish” databases
Flat files. Currently read-only.
Same format as the filesystem replica type.
119
Subversion (DEPRECATED)
Slow
Steady
Robust
Supports remote sync
Requires Subversion Perl Bindings
120
Backends are pluggable!
The filesystem is cheap and easy
The filesystem is portable
Help us write new backends:
CouchDB, SQLite, MySQL, Postgres, S3, AppEngine, $YOUR_FAVORITE_DB
121
Foreign Replicas
Prophet is designed to sync with “other” databases and systems
They don’t need to support all of Prophet’s features - Prophet knows how to interpret mumbo-jumbo from the Cloud
Foreign Replicas will usually be app specific
All current examples are for SD
122
Synchronization
123
Publish
Serialize and export all of a replica's resolutions and changesets
124
Pull
Integrate unseen resolutions and then unseen changesets from a replica
125
Push
Integrate new resolutions and changesets into a replica
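Pull and push can be sketched as the same operation seen from opposite sides. The Python toy below covers changesets only (resolutions are left out) and assumes each changeset is tagged with the replica it originated on plus a per-origin sequence number, so "unseen" means "higher than the last sequence I integrated from that origin". That bookkeeping is an assumption for this sketch, not a description of Prophet's internals.

```python
class Replica:
    """Toy replica that tracks which changesets it has already integrated."""

    def __init__(self, name):
        self.name = name
        self.changesets = []   # (origin, sequence, payload), in integration order
        self.seen = {}         # origin -> highest sequence integrated so far
        self._next_seq = 1

    def commit(self, payload):
        """Record a local change as a new changeset."""
        self._integrate((self.name, self._next_seq, payload))
        self._next_seq += 1

    def _integrate(self, changeset):
        origin, seq, _payload = changeset
        self.changesets.append(changeset)
        self.seen[origin] = seq

    def pull(self, other):
        """Integrate every changeset from `other` that this replica has not seen."""
        unseen = [cs for cs in other.changesets
                  if cs[1] > self.seen.get(cs[0], 0)]
        for changeset in unseen:
            self._integrate(changeset)
        return len(unseen)

    def push(self, other):
        """Pushing to a peer is the peer pulling from us."""
        return other.pull(self)


jesse, cl, laptop = Replica("jesse"), Replica("cl"), Replica("laptop")
jesse.commit("create ticket 5")
first_pull = cl.pull(jesse)     # CL gets Jesse's ticket
second_pull = cl.pull(jesse)    # nothing new the second time
cl.commit("resolve ticket 5")
pushed = cl.push(jesse)         # only CL's own change travels back
relayed = laptop.pull(cl)       # a third peer gets both changes via CL
```

The last line is the peer-to-peer part: the laptop never talks to Jesse, yet still ends up with his changeset.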
126
Conflicts
127
Resolving Conflicts
Figures out the best resolution
“Nullifies” the conflict so the changeset can be cleanly integrated
Integrates the conflicting changeset
Records the resolution as a new changeset
Records the resolution decision in the resolution database
128
“The Best Resolution”
Prophet has clever ways to figure out the best resolution.
If there are previous resolutions for the same conflict and a majority agree, use that
If the merger has specified a “prefer this side” choice, use that
Prompt the user to make a decision, giving them info about previous decisions for this conflict
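Read as a priority list, those three rules might be sketched as follows. This is Python written for illustration. The function name, the string labels for resolutions and the `ask` callback are all hypothetical, and treating the slide's order as strict priority order is an assumption.

```python
from collections import Counter


def best_resolution(previous, prefer=None, ask=None):
    """Pick a resolution for one conflict.

    previous: resolutions that peers already recorded for this same conflict
    prefer:   an optional "prefer this side" choice given by the person merging
    ask:      a callback that prompts the user; it receives the previous decisions
    """
    if previous:
        choice, votes = Counter(previous).most_common(1)[0]
        if votes * 2 > len(previous):   # strict majority, not just the most common
            return choice
    if prefer is not None:
        return prefer
    if ask is None:
        raise ValueError("no majority, no preference, and nobody to ask")
    return ask(previous)


by_vote = best_resolution(["theirs", "theirs", "mine"])
by_preference = best_resolution(["theirs", "mine"], prefer="mine")
by_prompt = best_resolution([], ask=lambda earlier: "theirs")
```

Since every resolution is itself synced to peers, each decision made by hand adds a vote that can settle the same conflict automatically elsewhere.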
129
Scaling
130
How does it scale?
Scaling to giant clusters is boring
(Can I play the “They’re not Green” card here?)
Scales to many weakly connected peers
You are not Google.
Does anyone here work for Google?
Current target is databases of O(50k) records
131
Why not, then?
We have a political agenda.
Cloud computing is not Open Source.
APIs for “export” are not good enough.
You should always have full control.
You probably don’t need to store 10 billion records in one database.
132
Do you have 10 billion bugs, customer contacts
or sales orders?
133
That said, we'd love to see a scalable,
high-performance Prophet replica store
134
Getting Involved
135
Project Status
Simple, well-defined Perl API
RESTy web API (with microserver)
Fast, lightweight backend
Small, active dev community
Great test coverage
...less than great documentation coverage
136
Our Plans
Better ergonomics
Improved search and indexing
(Including full-text indexing)
Client libraries for other languages
Proper security model
More apps
137
Codebase
Prophet
6937 lines of code and doc
1952 lines of tests
sd
2121 lines of code and doc
973 lines of tests
138
Prophet is very young
Prophet designed in April
Prophet core implemented in April
SD designed in April
SD built in June and July
139
We need your help!
Kick-ass functional and text indexing
Backend data store improvements
Slick GUIs for syncing
More Foreign Replicas for SD
Documentation improvements
A clever logo
New applications
140
Getting Prophet
Prophet
http://syncwith.us/prophet/download
SD
http://syncwith.us/sd/download
141