Open Source at Salesforce.com

How Open Source Software Embiggens salesforce.com
What We Use, And What We Contribute To
Ian Varley, salesforce.com, Principal Member of Technical Staff
@thefutureian

description

The Salesforce core stack embraces Open Source Software. Join one of our leading engineers and learn about how we tackle enterprise-grade challenges for our customers using Hadoop, HBase, Jetty, Solr, and Apache QPID. We'll also discuss the process of opening internal libraries like Aura and Phoenix, as well as Salesforce's place in the larger Open Source community.

Transcript of Open Source at Salesforce.com

Page 1: Open Source at Salesforce.com

How Open Source Software Embiggens salesforce.com
What We Use, And What We Contribute To

Ian Varley, salesforce.com, Principal Member of Technical Staff
@thefutureian

Page 2: Open Source at Salesforce.com

Safe Harbor
Safe harbor statement under the Private Securities Litigation Reform Act of 1995: This presentation may contain forward-looking statements that involve risks, uncertainties, and assumptions. If any such uncertainties materialize or if any of the assumptions proves incorrect, the results of salesforce.com, inc. could differ materially from the results expressed or implied by the forward-looking statements we make. All statements other than statements of historical fact could be deemed forward-looking, including any projections of product or service availability, subscriber growth, earnings, revenues, or other financial items and any statements regarding strategies or plans of management for future operations, statements of belief, any statements concerning new, planned, or upgraded services or technology developments and customer contracts or use of our services. The risks and uncertainties referred to above include – but are not limited to – risks associated with developing and delivering new functionality for our service, new products and services, our new business model, our past operating losses, possible fluctuations in our operating results and rate of growth, interruptions or delays in our Web hosting, breach of our security measures, the outcome of any litigation, risks associated with completed and any possible mergers and acquisitions, the immature market in which we operate, our relatively limited operating history, our ability to expand, retain, and motivate our employees and manage our growth, new releases of our service and successful customer deployment, our limited history reselling non-salesforce.com products, and utilization and selling to larger enterprise customers. Further information on potential factors that could affect the financial results of salesforce.com, inc. is included in our annual report on Form 10-K for the most recent fiscal year and in our quarterly report on Form 10-Q for the most recent fiscal quarter.
These documents and others containing important disclosures are available on the SEC Filings section of the Investor Information section of our Web site. Any unreleased services or features referenced in this or other presentations, press releases or public statements are not currently available and may not be delivered on time or at all. Customers who purchase our services should make the purchase decisions based upon features that are currently available. Salesforce.com, inc. assumes no obligation and does not intend to update these forward-looking statements.

Page 4: Open Source at Salesforce.com

Show of hands ...
● Use OSS?
● Contribute to OSS?
● Write their own OSS projects?

Page 5: Open Source at Salesforce.com

Developers at salesforce.com spend all day in open source software.

Page 6: Open Source at Salesforce.com

salesforce.com engineers work on an OSS stack ...
▪ Linux (Ubuntu, RHL)
▪ Java
▪ Eclipse (+ IntelliJ, vim, emacs)
▪ Guava, Apache Commons, more
▪ JUnit, Mockito, Selenium
▪ Git (+ p4)
▪ Memcached

Page 7: Open Source at Salesforce.com

What’s so great about open source software?

Page 8: Open Source at Salesforce.com

A rising tide ... lifts all boats.

Page 9: Open Source at Salesforce.com

It’s a win-win situation.
▪ Everyone gets more out than they put in
▪ You have control over your own destiny
▪ You can attract the industry’s best minds

● The smartest devs seem to gravitate towards open source
● So if you raised your hand before, give yourself a pat on the back.

Page 10: Open Source at Salesforce.com

So, what do we use?

Page 11: Open Source at Salesforce.com

Servlet Container
Servlet containers handle routing HTTP requests to code.

▪ Started w/ commercial product
▪ Feature: “steal” work from overloaded servers
▪ (Code name: Hamburglar)

But! Show stopper bug, and no way to fix it ...

Page 12: Open Source at Salesforce.com

Solution: Jetty
▪ http://www.eclipse.org/jetty/
▪ Year-long migration process
▪ Tricky with 10+ years of legacy code!
▪ Now running Jetty (almost) everywhere.

Page 13: Open Source at Salesforce.com

Search Indexing
Indexer takes text (e.g. Chatter posts) and makes it searchable.
▪ Original implementation: Lucene (forked)
▪ But, scale keeps increasing!
▪ Bottleneck: single-writer QFS on a SAN
▪ Needed solution to scale horizontally

Page 14: Open Source at Salesforce.com

Solution: Solr
▪ Horizontally scalable, REST interface
▪ Query / index on same host, no more SAN
▪ New features, core library is (latest) Lucene

We’ve also contributed some small fixes, and contracted a big fix to allow handling indexers with many cores (10K+!).

Page 15: Open Source at Salesforce.com

Contributing is a win / win.

Page 16: Open Source at Salesforce.com

Message Queue
Decouple calling code from its execution.
▪ Originally: 10-15 devs had rolled their own
▪ Centralized on a transactional queue (Vijay)
▪ Commercial product, deeply coupled to DB
▪ OK until: the “600” error. 3 years of back and forth.
▪ Eventually rewrote our layer to work around it!
▪ Scale problems: 50 -> 500 queues
▪ CPU contention at head of queue
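To make "decouple calling code from its execution" concrete, here is a minimal in-process sketch using Python's standard library. It is not how the Salesforce queue or QPID works; the names start_worker and enqueue are hypothetical, and a real broker adds persistence, transactions, and a separate tier.

```python
import queue
import threading


def start_worker(tasks, results):
    """Run queued jobs on a background thread until a None sentinel arrives."""
    def run():
        while True:
            job = tasks.get()
            if job is None:  # sentinel: stop the worker
                break
            func, args = job
            results.append(func(*args))

    worker = threading.Thread(target=run)
    worker.start()
    return worker


def enqueue(tasks, func, *args):
    """The caller only records what should run; it does not wait for it."""
    tasks.put((func, args))


tasks = queue.Queue()
results = []
worker = start_worker(tasks, results)

enqueue(tasks, pow, 2, 10)
enqueue(tasks, str.upper, "chatter")

tasks.put(None)  # tell the worker to finish
worker.join()
print(results)  # prints [1024, 'CHATTER']
```

The caller returns as soon as the job is on the queue. With a single worker the jobs run in the order they were enqueued, which is also why one hot queue can become a point of contention.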

Page 17: Open Source at Salesforce.com

Solution: QPID
▪ Apache project, good reputation
▪ Separate tier from the DB
▪ Ran into bugs …
▪ … and fixed them.
▪ 40% memory savings on client (QPID-4873; thanks Helen Kwong & Brian Toal)

Page 18: Open Source at Salesforce.com

Open source lets us bring our experts to help everybody.

Page 19: Open Source at Salesforce.com

Build: Ant
Build tools help get you from “code written” to “code running”.
▪ Used Apache Ant for years
▪ But, as the # of devs has grown …
▪ scale and maintainability problems.

Page 20: Open Source at Salesforce.com

Solution: Maven
▪ Moving core build to Apache Maven
▪ Goal is a more modular and decoupled build structure
▪ Declarative dependencies FTW
▪ OSGi: Apache Felix

Page 21: Open Source at Salesforce.com

Deployment: Home Grown
Deployment tools let you get code out to servers.
▪ Salesforce.com has always used a home-grown tool, “ReleaseRunner”
▪ Required for tight security model (no passwordless ssh, root)
▪ But as we scale out, manual methods aren’t cutting it

Page 22: Open Source at Salesforce.com

Solution: Puppet, Salt, Razor, Rundeck
Get code out to lots of servers with little manual involvement.
▪ Razor: automated machine inventory
▪ Puppet: deployment of bits and configuration
▪ Salt, Rundeck: service orchestration for restarts

All of this still very much WIP; nobody else does it w/ this level of security

Page 23: Open Source at Salesforce.com

Batch Processing
▪ Salesforce == RDBMS
▪ No great approach for batch processing
▪ Especially on sets that don’t fit in memory
▪ Working outside relational model very hard

Page 24: Open Source at Salesforce.com

Solution: Hadoop
Map/Reduce: ship computations to your data instead of vice versa
▪ Walter Macklem (Platform CTO); Codename: Gridforce
▪ +HDFS (distributed file storage)
▪ +Pig (a higher level language)
▪ Features: recommendations, search relevance, machine learning
▪ Log export pilot ...

(Ask your CSR/CSM/AM to get nominated for the pilot!)
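As a rough illustration of the Map/Reduce idea on the Hadoop slide, here is a tiny single-process word count in plain Python. It does not use Hadoop's API; the function names and the sample partitions are made up for the example. The point is that the map step runs separately against each partition of the data, and only small (key, value) pairs are moved to be grouped and reduced.

```python
from collections import defaultdict


def map_phase(partition):
    """Runs against one partition of the data; emits a (word, 1) pair per word."""
    return [(word.lower(), 1) for line in partition for word in line.split()]


def shuffle(mapped_outputs):
    """Group the emitted values by key across all partitions."""
    grouped = defaultdict(list)
    for pairs in mapped_outputs:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's values into a single result."""
    return {key: sum(values) for key, values in grouped.items()}


# Hypothetical data, already split into two partitions (think: two machines).
partitions = [
    ["open source rocks", "open data"],
    ["source of truth"],
]

counts = reduce_phase(shuffle(map_phase(p) for p in partitions))
print(counts["open"], counts["source"], counts["truth"])  # prints 2 2 1
```

In a real cluster each map_phase call would run on the machine that stores that partition in HDFS, so the raw lines never cross the network.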

Page 25: Open Source at Salesforce.com

Big Data
Relational databases are powerful ...
▪ But, started looking at cost model
▪ Lower than average (multi-tenancy)
▪ Model is so rich, prohibitive for really large data
▪ RDBMS has strict scalability limits per object
▪ Hard to scale out because it runs on big iron

What if we could store vast numbers of records, but with fewer capabilities and assumptions? Scale horizontally, but with the same safety guarantees?

Page 26: Open Source at Salesforce.com

Big Data: HBase
Horizontally scalable NoSQL database.
▪ Fewer capabilities (no joins, transactions)
▪ Scales by adding machines
▪ Fault tolerant (on HDFS)
▪ Features? Initially, audit & compliance, event tracking
▪ Eventually, a lot more: really big objects
▪ Got a lot of field history? Join the FHR retention pilot! (Talk to your CSM)

This is my team, so I could talk for hours. But go see Lars Hofhansl’s talk!

Page 27: Open Source at Salesforce.com

OK, that’s cool. But, does Salesforce contribute new projects?

Page 28: Open Source at Salesforce.com

Historically: no, not many.
But, this is changing.

Page 29: Open Source at Salesforce.com

Aura: UI Framework
Basis for new generation of Salesforce UI
▪ High performance client-server architecture
▪ Event-driven, MVC architecture
▪ https://github.com/forcedotcom/aura

Page 30: Open Source at Salesforce.com

Phoenix: a SQL Skin for HBase
“We put the SQL back in NoSQL”
▪ A proper subset of SQL
▪ Familiar interface, scalable storage
▪ Unlike Hive, uses the HBase client API
▪ Blazing fast; queries in milliseconds
▪ Tons of contribution since we opened it
▪ https://github.com/forcedotcom/phoenix

Page 32: Open Source at Salesforce.com

Lots more!
So far, we’ve only been talking about Salesforce core.
▪ Many salesforce.com companies use tons of Open Source:
• Heroku - https://github.com/heroku
• Radian6, Data.com, ExactTarget - you name it, we probably use it somewhere
▪ And lots of open source stuff on the platform, too!
• http://boards.developerforce.com/t5/Salesforce-Labs-Open-Source/bd-p/labs
• You can search GitHub for Apex & Salesforce

Page 33: Open Source at Salesforce.com

Salesforce.com isn’t just an OSS user.
We’re an OSS pusher.

Page 34: Open Source at Salesforce.com

Committers on dozens of big projects
Salesforce.com actively supports a lot of people who primarily contribute to open source projects (not just a side thing).
▪ Postgres: Tom Lane (Project Lead)
▪ Ruby: Matz (Project Lead)
▪ Maven: Jason Van Zyl (Project Lead)
▪ HBase: Lars Hofhansl (PMC, release manager); Jesse Yates
▪ Phoenix: James Taylor (Project Lead)
▪ Aura: Doug Chasman (Project Lead)
▪ Pig: Prashant Kommeredi

Page 35: Open Source at Salesforce.com

Is Open Source right for everything?
No.

Page 36: Open Source at Salesforce.com

It’s great for ...
▪ Core components
▪ Databases
▪ Common algorithms
▪ Reusable UI libraries & abstractions

And any case where “the source isn’t the secret sauce”.

Page 37: Open Source at Salesforce.com

It’s not great for ...
▪ Code entangled with your business model
▪ Code you didn’t write with a plan to open up
▪ Software that’s “all things to all people”
▪ Getting other people to do your work

But, these are kind of anti-patterns anyway, right … ?

Page 38: Open Source at Salesforce.com

Most return on investment is from open sourcing “the interesting bits”, rather than the whole stack.

Page 39: Open Source at Salesforce.com

And embracing the Open Source approach, particularly in the last 3 years, has been a sea change.

Page 40: Open Source at Salesforce.com

In conclusion ...

Page 41: Open Source at Salesforce.com

In contributing, we all gain.
Look for more OSS involvement from salesforce.com in the future!
