SilverLining. Stuff we're covering Hardware infrastructure and scaling Cloud platform as a service...


SilverLining


Stuff we're covering

• Hardware infrastructure and scaling
• Cloud platform as a service
• The SilverLining Project


Some context

• We work at a university
• Funding based on projects
• Biodiversity web apps and APIs
• Focus on software (not hardware)


Infrastructure

• Applications depend on infrastructure
• Infrastructure that "just works" is expensive
• More money for infrastructure means less money for application development
• Degenerates without long-term funding
• Unreliability is bad for applications
• Increasingly bad user experience over time


• $1.6M USD total budget to 17 institutions
• $245k USD (30.6% of direct costs) for infrastructure
• $100k USD (12.6% of direct costs) for core application development
  o DiGIR provider, DiGIR portal


MaNIS, ORNIS, HerpNet, FishNet 

• $7.6M USD combined budgets, 71 institutions
• $196k USD annual operating cost
• $179k USD (92%) for infrastructure


Infrastructure as a Problem (IaaP)

• Unsustainable
• Creates a barrier to innovation
• And this is all before scaling comes into play!


Scalability

"The ability for infrastructure to reliably handle heavy request loads in a high performance way."


IaaP at scale 


Scaling up

• Scale up vertically with a server upgrade
• Scale out horizontally with more servers


Scaling DiGIR networks
MaNIS, ORNIS, HerpNet, FishNet

• ~85 million records
• ~100 servers



Query: All records with a point


Response: Error: IO problem


"Scaling is hard." - Alex Payne

al3x.net/2010/07/27/node.html


Scaling in the small

• Handling dozens of requests per second
• Scaling up vertically is sufficient
• Performance improvements are software related

al3x.net/2010/07/27/node.html


Scaling in the large

• Billions of requests per week (Google)
• Millions of active users (Facebook)
• Data centers worldwide with millions of servers

al3x.net/2010/07/27/node.html


Are we scaling large or small?

• GBIF ~220 million records
• eBird ~2 million new records per month
• Undigitized collections ~2.5 billion records


Scaling in the "small-ish"

• We're at the brink!
• IaaP is in the way, scaling is making it worse
• Where's the silver lining in all of this?


Platform as a Service (PaaS)
en.wikipedia.org/wiki/Platform_as_a_service

Conceptually quite simple:
• Computing power over the Internet
• No servers to maintain
• Pay for use
• Scales large (even if your application is small)
• Provided by companies such as Amazon, Microsoft, Google


SilverLining
silver-lining.googlecode.com

• Experiments, metrics, prototypes (not products)
• Picked Google App Engine
• PaaS with biodiversity data
• Simple Darwin Core
• Bulk loading, storage
• MapReduce - indexes, validation, statistics
• Optimize for resource efficiency, search performance
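The "MapReduce - indexes, validation, statistics" bullet can be illustrated with a toy, in-memory sketch. This is not the App Engine MapReduce library and not the project's code. The sample records and the helper names (`map_record`, `reduce_counts`, `georeferenced_by_country`) are assumptions made up for illustration; the field names follow Simple Darwin Core terms. The map step emits a key for each record that has a point, and the reduce step sums per key, which gives a simple statistic.

```python
from collections import defaultdict

# Hypothetical sample records; keys follow Simple Darwin Core term names.
RECORDS = [
    {"country": "US", "decimalLatitude": 37.87, "decimalLongitude": -122.26},
    {"country": "US", "decimalLatitude": None, "decimalLongitude": None},
    {"country": "MX", "decimalLatitude": 19.43, "decimalLongitude": -99.13},
]


def map_record(record):
    """Map step: emit (country, 1) for each record that has a point."""
    has_point = (
        record.get("decimalLatitude") is not None
        and record.get("decimalLongitude") is not None
    )
    if has_point:
        yield record["country"], 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted values per key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def georeferenced_by_country(records):
    """Run the map step over every record, then reduce the emitted pairs."""
    pairs = (pair for record in records for pair in map_record(record))
    return reduce_counts(pairs)


if __name__ == "__main__":
    print(georeferenced_by_country(RECORDS))
```

On a real platform the map and reduce steps would run in parallel across many workers; here they run in one process only to show the shape of the computation.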


Cost comparison

Total annual operating costs of vertebrate networks:
• Current architecture: USD $195,600
• Projected App Engine: USD $19,540

Total cost for SilverLining work to date:
• 50 cents
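To make the size of the gap explicit, the two annual figures above can be compared with a few lines of arithmetic. This is only a sketch: the two dollar amounts come from the slide, while the function name `cost_comparison` and the rounding to one decimal place are choices made here.

```python
# Annual operating costs in USD, taken from the slide.
CURRENT_ANNUAL_USD = 195_600
PROJECTED_APP_ENGINE_USD = 19_540


def cost_comparison(current, projected):
    """Return absolute savings, percentage savings, and the cost ratio."""
    savings = current - projected
    return {
        "savings_usd": savings,
        "savings_pct": round(100 * savings / current, 1),
        "ratio": round(current / projected, 1),
    }


if __name__ == "__main__":
    print(cost_comparison(CURRENT_ANNUAL_USD, PROJECTED_APP_ENGINE_USD))
```

Under these figures the projected App Engine cost is roughly a tenth of the current annual cost.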


App Engine
code.google.com/appengine

• Develop scalable web apps on Google's infrastructure
• No servers or hardware to maintain, plus free quotas
• Standards-based Java and Python SDKs
• IDE support for Eclipse, NetBeans, IntelliJ
• Local development server
• Integrated support for unit testing


App Engine constraints

• Practical constraints for performance and scalability
• The datastore is not a relational database
• Query can only use inequality filters on 1 property
• Fails: year >= 1980 and year <= 1982 and elevation > 10
• Solution: Set membership queries
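The one-property rule for inequality filters can be checked mechanically. The sketch below is not App Engine code: it models a query as a list of hypothetical `(property, operator, value)` tuples and counts how many distinct properties carry an inequality operator. The operator set and the property name `year_within_1` are assumptions for illustration.

```python
# Operators treated as inequality filters in this sketch (an assumption).
INEQUALITY_OPS = {"<", "<=", ">", ">="}


def inequality_properties(filters):
    """Return the set of property names used with an inequality operator."""
    return {prop for prop, op, _ in filters if op in INEQUALITY_OPS}


def satisfies_single_inequality_rule(filters):
    """True if inequality filters touch at most one property."""
    return len(inequality_properties(filters)) <= 1


# The failing query from the slide: inequalities on both year and elevation.
FAILING = [("year", ">=", 1980), ("year", "<=", 1982), ("elevation", ">", 10)]

# A rewritten form: the year range becomes an equality test on a
# hypothetical precomputed property, leaving one inequality property.
REWRITTEN = [("year_within_1", "=", 1981), ("elevation", ">", 10)]

if __name__ == "__main__":
    print(satisfies_single_inequality_rule(FAILING))
    print(satisfies_single_inequality_rule(REWRITTEN))
```

The two inequalities on `year` alone would be fine; it is the additional inequality on `elevation` that breaks the rule.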


Set membership queries

• Before: year >= 1980 and year <= 1982 and elevation > 10
• After: year "within 1 year" of 1981 and elevation > 10
• List for "within 1 year" of 1980: [1979, 1980, 1981]
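One plausible reading of the bullets above is that each record stores a precomputed list of the years it is "within 1 year" of, so the range test becomes a membership (equality) test on that list. The sketch below simulates this in memory rather than against the datastore. The property name `year_within_1`, the sample records, and the helper names are assumptions, not taken from the project.

```python
def within_one_year(year):
    """Years that a record from `year` is within 1 year of."""
    return [year - 1, year, year + 1]


def index_record(record):
    """Return a copy of the record with the precomputed list attached."""
    indexed = dict(record)
    indexed["year_within_1"] = within_one_year(record["year"])
    return indexed


def query(records, target_year, min_elevation):
    """Membership test on the list, plus a single inequality on elevation."""
    return [
        r for r in records
        if target_year in r["year_within_1"] and r["elevation"] > min_elevation
    ]


# Hypothetical sample data.
RAW = [
    {"id": "a", "year": 1979, "elevation": 100},
    {"id": "b", "year": 1980, "elevation": 50},
    {"id": "c", "year": 1982, "elevation": 5},
    {"id": "d", "year": 1982, "elevation": 300},
    {"id": "e", "year": 1983, "elevation": 300},
]
INDEXED = [index_record(r) for r in RAW]

if __name__ == "__main__":
    print([r["id"] for r in query(INDEXED, 1981, 10)])
```

A record from 1980 carries the list [1979, 1980, 1981], which contains 1981, so it matches the "within 1 year of 1981" query. The trade-off is extra storage per record in exchange for keeping elevation as the only inequality filter.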


Aggregation and synchronization
code.google.com/p/pubsubhubbub
code.google.com/apis/feed/push

• Fast aggregation via API
• Subscribe to changes at the source
• Changes pushed automatically


What's the end game?

• PaaS instead of IaaP
• SaaS (software as a solution)
• BaaS (biodiversity applications at scale)

Any QaaC? (Questions as a challenge)

Aaron [email protected]

John [email protected]