© 2016 IBM Corporation
Hadoop Summit – Dublin 2016
Surviving the Hadoop Revolution
Adriana Zubiri ([email protected]), Program Director, Analytics
Scott C. Gray ([email protected]), Senior Architect and STSM, Big SQL, Big Data Open Source
© 2016 IBM Corporation, Hadoop Summit – Dublin – April 2016
Who Are We?
Adriana Zubiri
- Program Director, IBM Analytics
- Former development lead, IBM Big SQL
- Data Warehouse Performance expert
- Database engine developer (DB2)
Scott C. Gray
- Lead Architect of IBM Open Platform for Apache Hadoop
- On the ODPi (http://odpi.org) Technical Steering Committee
- Former lead architect of IBM Big SQL
- Open source developer (jsqsh and jline2)
Disclaimer: The perspective of this presentation is shaped by our backgrounds and experiences; it is our own and not that of IBM.
Agenda
The business impact of open source
- Once upon a time
- Then the world changed
- What customers want … or they think they want…
- Adapt or else…
Software Engineering: a new era
- Traditional vs open development model
- Integrating into the community
- Changing processes: getting agile
What’s Next?
Proprietary software
- own the software end to end
Major releases years apart
- focus on high quality and Service Level Agreements (SLAs)
Focus on patents
- protect from the competition
Competition was well defined
- relatively small number of players
Customers with large IT budgets
- multi-year contracts, large profit margins
Setting the stage… for Hadoop
Data volume & variety explosion
Expensive storage
Budget restrictions due to economic climate
Open source gaining popularity in enterprises
Large data sets
Cheap commodity hardware
Easy to Scale
Resilient
Open & Free software
What customers want…
- To move to open source
  • Open software vs. vendor lock-in
- Inexpensive software
  • Unwilling to run unsupported software
  • Free software is not free: deployment costs
- Bleeding edge and stable high quality software
In the meantime… at many Corporate HQs….
No! It will take part of our business!
Yes! We risk becoming irrelevant!!
Do we embrace and support?
What Business Model?
Open Source Software Pure Plays
- Revenue based on maintenance and support
Value Add Plays
- They leverage Open Source Software
- Proprietary value on top of Open Source, rather than depending only on maintenance revenue
- Software as a Service companies are deploying this strategy
Why might companies open source their code?
Products that are not significantly more advanced than what the open source community has
Products where the community is very active and will catch up soon
Products that are already unprofitable
Areas where they need infrastructure for value adds up the chain
To improve customer and community perception
Why might companies still NOT open source their code?
Giving up control
- Don’t own the roadmap of the project
Too complex and large
- Low chance for the community to adopt
It’s expensive!
- Need to invest in being the custodian of code that you don’t make money from
- Need to prepare the code for open source: remove proprietary IP, write docs, go through code clearance
Corporate politics
- Open source projects have corporate leaders, and sometimes corporate politics are involved when someone wants to contribute
Marketing & Sales: Who do we compete with?
Number of players in some spaces is unprecedented (e.g. SQL)
- Barrier to entry is low
- Small and open is seen as attractive
- Everyone can sit at the table (vs traditional vendors)
The new rules of roadmaps
- We compete on features delivered and on planned features
Traditional benchmarking: the right strategy to show we are faster?
- The fast pace of new releases makes benchmarks obsolete almost as soon as they are published
Keeping customer trust
- Sometimes our solutions are faster and more reliable, but … the feature exists in open source
- When just good enough is enough
Evolving Software Engineering
Waterfall to Agile
and
Learning to Let Go
Once Upon a Time – There Was Lots of (Slow) Process
Development involved a lot of waterfall-style process
- Gather requirements from the customer
- Prioritize the requirements
- Develop and publish a roadmap
- Develop a project plan for each release
- Develop and document
- QA
- Release
- Rake in the money!
- Repeat
So what you do is you take the specifications from the customers and you bring them down to the software engineers?
Once Upon a Time – We Were Control Freaks!
With all that process also came a lot of control:
- What features will be available
- When features will be made available
- How features will be built
- Quality and testing of all of the features
- Focus on usability and not just function
- Compatibility testing and assurances
- Thorough knowledge of the code base
- Detailed documentation of functionality
- Control over code style, documentation, interfaces
- Etc., etc., etc.
The Results of Process and Control?
Slow, methodical pace…
- Difficult to “keep pace” with this rapidly evolving open source world, but…
- Enterprise customers tend to like the stability and predictable evolution
- Integrating applications have time to stabilize
(Reasonably) strong assurances to customers
Complete responsibility for quality (both the good and bad)
Freedom and autonomy
Control In The New Age: Learning to Let Go
Adopting Free and Open Source Software (FOSS) as a foundation means letting go of much of the control we so cherished!
The community controls a lot
- Features and functionality
- Release timeline and roadmap
- Quality, stability, and performance
- Backwards compatibility
- Quality of documentation
At times one or more of the above is lacking
- And not in our control!
It’s OPEN, dummy. Get involved.
Fix it.
Improve it.
The community is YOU!
Open Source
Entering the Community
Some developers have been involved in ASF
Most came from closed source
- ASF processes feel burdensome and unfamiliar
- Not clear/comfortable how to work openly
Learning to work with the community
- Filing JIRAs and waiting for something to happen
- Getting attention from the community
- Accepting input and criticism
- Building committers
- Designing in the open
- Being patient
Community Challenges
Like your local community, there are conflicts, disputes, and other challenges
Sometimes the “community” is small, inclusive
- The founding team usually maintains a lot of control
- May be extraordinarily difficult to become a committer or have code accepted
- They may be (rightfully) very concerned about quality and design
Even an “open” project has its fiefdoms
- Even as a committer you can’t just “force” changes
- Different areas have different owners/authors
- Conflicts arise if you just commit without review
Closed source has these problems too! But working directly with people means they can be quickly solved by:
Meetings & Leaders
Hand-to-Hand Combat
Evolving to Agile
We used to deliver software on feature boundaries
- A release was “done” when a certain set of features was completed
The open development model mandates an AGILE strategy
- We could no longer plan when a feature might be added
- We can control the time to implement, but not the time to adoption
- We could no longer publish detailed time-based roadmaps
  • Sales, marketing, product management HATE this!
- But agile paid off by speeding up our rate of release to the market and our response to customer demands
Policing & Herding Cats: Many Projects With Many Unknowns
Nowhere can FOSS agility prove more challenging than maintaining a Hadoop distribution!
Bringing together many different projects, each:
- Evolving at its own pace
- With different degrees of stability and docs
- Different levels of maturity and security
- Varied user interfaces
- May include “preview” features that customers will invariably use
- Different levels of compatibility with the others
- Varied levels of backwards compatibility with prior releases
We are committed to working with projects to address as much as we can
- It is truly challenging though!
So what’s next?
Hadoop and open source have forced an evolution of the traditional software business
They continue pushing the industry to innovate to stay relevant… to survive
- Pushing new areas of research
- Identifying and defining new markets
- Pushing technological advances
- Continuing to increase the rate of innovation