201511 - Alfresco Day - Platform Update and Roadmap - Gabriele Columbro - Boston
Alfresco Platform Update & Roadmap
Gabriele Columbro, Sr. Product Manager, API / SDK / Platform
@mindthegabz
Alfresco Day Boston, November 2015
Alfresco Platform Roadmap: A look at today’s presentation agenda
• Platform Vision: What problems the Alfresco Platform helps, and will help, solve, and for which personas
• Platform projects: Overview of current ongoing platform initiatives
– Extreme Scalability: Proving Alfresco at Cloud scale and providing tools & a reference point for real-life implementations
– Upgrade Task Force: Simplification of the customer maintenance lifecycle, in response to overwhelming customer validation
– Share separation: Effects of the Share separation and Core platform modularization
– Dev Platform & SDK: Consolidate & expand APIs / extension points to ensure high-longevity Alfresco application development and greatly simplify SDK-based Alfresco development
• What’s in it & when? When you can expect the release of the ongoing projects, and what the backlog and horizon 2 projects are
• Conclusions and Q&A: Recap of the platform lifecycle makeover and open discussion
Vision for the Alfresco Platform: Objectives and guiding forces driving development of the Alfresco Platform
Platform Vision
Build an open and scalable platform to power the rapid development and deployment of hybrid content-centric applications in the Alfresco extended ecosystem
Driving Forces: Technology & market innovation driving Alfresco Platform strategy
• Hybrid ECM: Platform and solutions should be able to run on premise, in the cloud, or both
• Innovate at Cloud speed: Deliver innovation to the on-premise and cloud products with the agility typical of pure SaaS players
• Think Big: Enable the scaling of people, processes and products
• Customer driven: Customer feedback, research, validation and pretotyping at the core of the ideation and decision-making process
Customer data Driven: Key platform improvements research has uncovered
[Figure: improvement areas ranked #1 to #7 by customer research]
• Backwards compatibility: Java modules
• Backwards compatibility: Share extensions
• Backwards compatibility: Remote applications
• Improve content reindexing
• Modules isolation
• In-place upgrade for SP & HF
• Lack of zero-downtime upgrades
Alfresco Platform projects: Ongoing developments in the Alfresco Platform
Platform Investments: An end-to-end Platform lifecycle makeover
[Figure: initiatives mapped onto the lifecycle stages Development, Testing, Release, Deployment, Integration and Maintenance]
• Standard Dev Env
• Share separation
• API BCKs
• API compatibility
• Xtreme scalability
• JAR modules
• Modules isolation
• Dev Docs / Samples
• Solr Sharding
• Suite installers
• In-place SP & HF
Alfresco reaches the 1B document mark on AWS
Highlights
• 10 Alfresco 5.1 nodes, 20 Solr 4 nodes in Sharding mode, 1 Aurora DB
• Loaded 1B documents at 1000 docs / sec (86M per day)
• Indexed 1B documents in 5 days (more than 2000 docs / sec)
• No degradation in ingestion or content access upon content growth
• Tested up to 500 Share concurrent users and 200 CMIS concurrent sessions
Press release
“We applaud Alfresco’s ability to leverage Amazon Aurora to address business requirements of the modern digital enterprise, and enable a more agile and cost-effective content deployments.”
Anurag Gupta, Vice President, Database Services, Amazon Web Services, Inc. – 2015 October 6th
Benchmark Results: Introducing the Extreme Scalability benchmark
• Repository Layout
– 10k sites; 2 levels deep; 10 folders per level; 1000 files per folder
– 100 kb avg plain text files with varying content complexity (for indexing purposes)
– Default content model
• Scenarios
– Share interaction (Enterprise Collaboration)
  • First focused on the Repository, no Search
  • Then with Search, including Solr4 Sharding
– CMIS interaction (Headless Content Platform)
  • Transactional Metadata Query testing
• Fully cloud AWS environment (provisioned by chef-alfresco)
– Alfresco 5.1 + Share 5.1 (development code, unreleased)
– AWS EC2 / Aurora (MySQL compatible and Alfresco supported)
– Ephemeral storage for the index / EBS for content storage (spoofed)
Cloud stack: 1.2B documents execution environment
• UI Test x 20 (m3.2xlarge): simulate 500 users
  – Selenium / Firefox
  – 1 hour constant load
  – 10 sec think time
• ELB
• Alfresco x 10 (c3.2xlarge): Alfresco Repo and Share
• Solr x 20 (m3.2xlarge): Sharded Solr 4
• Aurora x 1 (db.r3.xlarge)
• EBS: ingestion (in place)

sites     folders      files            transactions   dbSize (GB)
10,804    1,168,206    1,168,206,000    15,475,064     3,185
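The repository totals in the table above can be cross-checked against the benchmark layout of 1000 files per folder. The following is a minimal sketch, not part of the original deck; the function names are illustrative, and the per-file database footprint assumes 1 GB = 10^9 bytes.

```python
# Figures copied from the repository totals table.
SITES = 10_804
FOLDERS = 1_168_206
FILES = 1_168_206_000
DB_SIZE_GB = 3_185

# From the benchmark repository layout: 1000 files per folder.
FILES_PER_FOLDER = 1_000


def files_from_folders(folders: int, files_per_folder: int = FILES_PER_FOLDER) -> int:
    """Number of files implied by the folder count and the layout."""
    return folders * files_per_folder


def db_bytes_per_file(db_size_gb: float, files: int) -> float:
    """Average database footprint per file, taking 1 GB = 10**9 bytes (assumption)."""
    return db_size_gb * 10**9 / files


if __name__ == "__main__":
    print("file count matches layout:", files_from_folders(FOLDERS) == FILES)
    print("average folders per site:", round(FOLDERS / SITES, 1))
    print("average DB bytes per file:", round(db_bytes_per_file(DB_SIZE_GB, FILES)))
```

The file count is exactly the folder count times 1000, and the database holds a few kilobytes of metadata per document on average.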
Cloud scale testing: How did we test it?
• Repository loaded using bm-dataload (with file spoofing option)
• 1B document benchmark AKA BM-0004 - Testing Repository Limits, based on bm-share
• Scalability & Sizing testing on Enterprise Collaboration Scenario (bm-share) and Headless Content Platform (bm-cmis)
https://wiki.alfresco.com/wiki/Benchmark_Testing_with_Alfresco
https://github.com/derekhulley/alfresco-benchmark
[Figure: benchmark framework architecture]
• Benchmark Server (Tomcat 7): UI, Rest API, Services
• MongoDB: Config Data and Test Data
• Benchmark Drivers (xN) (Tomcat 7): Test Services, Rest API, Extras (Selenium)
• Load Balancer in front of the Servers / APIs
Benchmark Results: Getting to 1B documents
• Ingestion
– With 10 nodes, 1000 documents / second (3.6 million per hour, 86M per day, 12 days for the full repo); spoofed content comparable to in-place BFSIT loading
– Load rate consistent even beyond 1B documents
– Throughput grew linearly by adding ingestion nodes (100 docs / sec per node)
– Adding loading nodes is likely to raise ingestion throughput, as Aurora was only at 50% CPU
• Indexing
– Index distributed over 20 Alfresco Index Servers, sharding on ACLs (good for site-based repositories), with a dedicated Alfresco tracking instance
– Each shard holds in excess of 50M nodes
– Re-indexing completed in about 5 days (each node tracks a subset of the 1B)
– Dynamic sharding autoconfiguration (5.1 feature)
NOTE: requires Alfresco tracking nodes to be in the cluster
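The ingestion and indexing figures above follow from simple rate arithmetic. The sketch below reproduces them; it is not from the original deck, the function names are illustrative, and the linear scaling per node is an assumption extrapolated from the reported 100 docs / sec per node.

```python
SECONDS_PER_HOUR = 3_600
SECONDS_PER_DAY = 86_400


def cluster_rate(nodes: int, docs_per_sec_per_node: int = 100) -> int:
    """Aggregate ingestion rate, assuming throughput stays linear in node count."""
    return nodes * docs_per_sec_per_node


def days_to_process(total_docs: int, docs_per_sec: float) -> float:
    """Days needed to load (or index) total_docs at a constant rate."""
    return total_docs / (docs_per_sec * SECONDS_PER_DAY)


def average_rate(total_docs: int, days: float) -> float:
    """Average docs / sec implied by processing total_docs in the given days."""
    return total_docs / (days * SECONDS_PER_DAY)


if __name__ == "__main__":
    rate = cluster_rate(10)  # 10 ingestion nodes
    print("docs per hour:", rate * SECONDS_PER_HOUR)
    print("docs per day:", rate * SECONDS_PER_DAY)
    print("days to load 1B:", round(days_to_process(1_000_000_000, rate), 1))
    print("indexing docs / sec over 5 days:", round(average_rate(1_000_000_000, 5)))
```

The same functions can be used to estimate load time for a different node count, keeping in mind that the database was reported at only 50% CPU with 10 nodes and that linearity beyond that point was not measured.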
The following information is based on a development version of the unreleased Alfresco 5.1. Performance data is provisional and subject to change based on testing the final Alfresco 5.1 release.
Benchmark Results: Testing Alfresco on 1B docs
• Repository Only (500 Share users) test
– Sub-second login times and good, linear responses for other actions
  • Open Library: 4.5s / Page Results: 1s / Navigate to Site: 2.3s
– CPU loads:
  • Database: 8-10% / Alfresco (each of 10 nodes): 25-30%
  • Shows room for growth up to 1000 concurrent users
• Repository + Search (100 Share users)
– Metadata and full text search ~ 5s (on 1B documents)
– 1.2 searches / sec hitting the 20 shards
• TMDQ queries (database only, no index) via CMIS
– IN_FOLDER (sorted, limited) ~ 160ms at CMIS interface
– CMIS:NAME (=, LIKE) ~ 20ms at CMIS interface
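For readers unfamiliar with the query shapes named above, here is a minimal sketch of how such CMIS QL statements might be built. It is an illustration only: the helper names are hypothetical, the exact statements used in the benchmark are not given in the deck, and the "limited" part is assumed to be passed as the CMIS query service's maxItems parameter rather than written in the statement.

```python
def cmis_literal(value: str) -> str:
    """Quote a string for CMIS QL, escaping backslashes and single quotes."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def in_folder_query(folder_id: str, order_by: str = "cmis:name") -> str:
    """Sorted listing of the documents directly inside one folder."""
    return (
        "SELECT * FROM cmis:document "
        f"WHERE IN_FOLDER({cmis_literal(folder_id)}) "
        f"ORDER BY {order_by} ASC"
    )


def name_equals_query(name: str) -> str:
    """Exact match on cmis:name."""
    return f"SELECT * FROM cmis:document WHERE cmis:name = {cmis_literal(name)}"


def name_like_query(pattern: str) -> str:
    """Pattern match on cmis:name, using % as the wildcard."""
    return f"SELECT * FROM cmis:document WHERE cmis:name LIKE {cmis_literal(pattern)}"


if __name__ == "__main__":
    # "folder-id-placeholder" and the file names are made-up sample values.
    print(in_folder_query("folder-id-placeholder"))
    print(name_equals_query("report.txt"))
    print(name_like_query("report%"))
```

Queries of this kind touch only metadata, which is why they can be answered from the database without going to the index.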
Recommendations: Lessons Learned
• A single Alfresco repository can grow to 1B documents on AWS without notable issues, especially with a scalable DB like AWS Aurora
• As for the index: Shard, Shard, Shard
– Shard to cope with content growth
  • Single Solr instance tuned for 50M docs / 32GB
– Shard for performance / SLA
  • Improve performance of search on large scale repositories to hit SLA requirements
– Shard for operational reasons
  • Improve reindexing time (1B docs re-index in 5 days with 20 shards)
NOTE: Sharding adds the cost of post-ranking results across shards. Use it judiciously.
• No indications of any size-related bottlenecks with 1.1 Billion Documents
• DB Indexes optimized (no index scans) even at a 3.2TB Aurora DB
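The sizing guidance of one Solr instance per 50M documents translates into a simple shard count estimate. This is a rough sketch under that single assumption; it ignores the performance and operational reasons for sharding listed above, and the function name is illustrative.

```python
# Sizing figure from the lessons learned: one Solr instance tuned for 50M docs.
DOCS_PER_SHARD = 50_000_000


def shards_needed(total_docs: int, docs_per_shard: int = DOCS_PER_SHARD) -> int:
    """Smallest number of shards keeping each shard at or under docs_per_shard."""
    if total_docs <= 0:
        return 1
    # Integer ceiling division avoids floating point rounding on large counts.
    return -(-total_docs // docs_per_shard)


if __name__ == "__main__":
    print(shards_needed(1_000_000_000))   # the 1B document mark
    print(shards_needed(1_168_206_000))   # the full benchmark repository
```

At 1B documents the estimate is the 20 shards used in the benchmark; for the full 1.17B repository it would be slightly more, consistent with each shard holding in excess of 50M nodes.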
5.1: Key Alfresco 5.1 scalability items to look forward to
• Alfresco Solr Sharding
– On ACL
– Tested up to 80M documents per shard and 20 shards
• Improved Transactional metadata queries
– Boolean, Double and OR construct
• Easy deployment and scaling in AWS using provisioning technologies like chef-alfresco
• Alfresco support for Amazon Aurora (also available in Alfresco 5.0)
• Updated field collaterals
– Scalability Blueprint for Alfresco 5.1
– Sizing Guide for Alfresco 5.1
– AWS Reference architecture, implementation guide and CloudFormation template for Alfresco 5.0 and 5.1
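To illustrate why sharding on ACL suits site-based repositories, the sketch below routes documents to shards by their ACL identifier. It is a simplified model, not Alfresco's actual routing function: the modulo rule, the function names and the sample identifiers are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def shard_for_acl(acl_id: int, shard_count: int) -> int:
    """Pick a shard from the ACL id (hypothetical modulo rule)."""
    return acl_id % shard_count


def route(documents: Iterable[Tuple[str, int]], shard_count: int) -> Dict[int, List[str]]:
    """Group (doc_id, acl_id) pairs by the shard their ACL maps to."""
    shards: Dict[int, List[str]] = defaultdict(list)
    for doc_id, acl_id in documents:
        shards[shard_for_acl(acl_id, shard_count)].append(doc_id)
    return dict(shards)


if __name__ == "__main__":
    # Two documents sharing ACL 7 (say, the same site) and one with ACL 28.
    docs = [("doc-a", 7), ("doc-b", 7), ("doc-c", 28)]
    print(route(docs, shard_count=20))
```

Documents that share an ACL always land on the same shard under such a rule, so a repository with many sites and many distinct ACLs spreads evenly, while one dominated by a single ACL would not.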
Check out all the details!
https://www.alfresco.com/blogs/how-alfresco-powered-a-1-2-billion-document-deployment-on-amazon-web-services/
Upgrade Task Force: Enabling seamless maintenance for Alfresco
1. In place application of SP & HF (not major and minor upgrades, for now)
2. Separation of Share and Platform releases for independent consumption (and definition of a clear compatibility matrix)
3. Consolidation of Public API Lifecycle to ensure high longevity customizations (no need for re-test)
NOTE: Not tied to Alfresco 5.1, the update assistant will be released for earlier versions
Share / Platform separation: Effects on the product lifecycle
• Dev: Platform and Share can be built and developed independently
• Release: Platform and Share can be released independently (or together)
• Install: Suite and independent installers for Alfresco and Share
• Maintain: Consume new versions of Platform & Share independently
Modularizing the platform: Breaking the monolith
• Alfresco Platform: Core set of functionalities exposing extension points including Java and ReST APIs
• Transformation services: Can be scaled independently using the transformation server or in MM for video transformations
• Share services (New!): Subset of platform functionalities now extracted in a separate module (AMP) following the Share release lifecycle
• Search services: Can be scaled independently as it relies on Solr4 standalone (with Replication and Sharding support)
Share separation takeaways: What you need to know!
1. Share (only) releases will now contain a share-services.amp which contains Share specific backing APIs
2. Platform (only) releases will no longer contain Share specific Java services
3. All-in-one installers (Share + Alfresco + AMPs) will be produced
4. Compatibility between Share & Alfresco is driven by the Java (not ReST) APIs compatibility policy (wait for it…in the next slides!)
5. Expect more frequent Share releases on prem (quarterly) and on cloud
Alfresco for the Developers: What’s great about Alfresco Dev Platform
1. Comprehensive set of content management & workflow Java and ReST APIs
2. Modular UI framework for custom business solutions
3. De facto standard based and enterprise-ready SDKs for web and mobile development
The Alfresco Developer conundrum: Multiple ways Alfresco helps you achieve your custom solutions
[Figure: three development approaches, each needing compatibility, a developer environment, and samples & documentation]
• Standalone App
• Platform Extension
• Share Extension (Aikau based or Surf based)
• ReST: Strategic / Java: Tactical
Consolidate & Extend: Platform execution strategy
• Consolidate (Alfresco.next): Alfresco.next will deliver on API fidelity and Dev Experience
• Extend (post Alfresco.next): Invest in increasingly complete and self-servicing ReST APIs
Developer platform consolidation: Ongoing activities targeting Alfresco.next
1. Documentation of supported Platform, Share and ReST extension points
– Move old webscript ReST API to Limited Support
– Invest in the new Alfresco ReST API V1
– Clearly identify and document supported Java and Share extension points
2. API lifecycle, support and backward compatibility
– In process: major version support
– ReST: independently versioned and inherently backward compatible
3. Customer success driven tactical investments on the Java platform & modules
– Simple JAR module support (for Alfresco and Share)
– Physical isolation of modules without need to modify Alfresco (immutable)
– Share modules support and reporting
Enterprise grade Developer docs: A glimpse of the improved Alfresco Dev Experience
docs.alfresco.com/5.0/concepts/dev-extensions-share-extension-points-introduction.html
Status
• Live for Share on Alfresco Docs
• WIP for Java and ReST
How can you help
• Send feedback to me, [email protected] or via the Alfresco DEVPLAT project
So what about compatibility? For internal and external Alfresco extensions and integrations
1. Major version for Platform and Share extensions (modules)
– Your custom module built on the 5.1 Public API will work throughout the whole 5.x
– Alfresco modules can be compatible for a major version
2. ReST API version driven support for integrations (standalone apps)
– Not bound to the Alfresco version
– Clear rules for versioning of ReST APIs
– Supported until v+2 is released or 1y after v+1 is released (whichever comes first)
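The two compatibility rules above can be expressed as small checks. This is a sketch of one reading of the slide, not an official policy implementation: the function names are hypothetical, the module rule is taken literally as "same major version", and all dates used with it are made-up examples.

```python
from datetime import date
from typing import Optional


def add_one_year(d: date) -> date:
    """Same calendar day one year later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        return d.replace(year=d.year + 1, day=28)


def rest_support_end(v_plus_1_release: Optional[date],
                     v_plus_2_release: Optional[date]) -> Optional[date]:
    """End of support for ReST API version v.

    Support ends at the earlier of: the v+2 release, or one year after the
    v+1 release. Returns None while neither successor has been released.
    """
    candidates = []
    if v_plus_1_release is not None:
        candidates.append(add_one_year(v_plus_1_release))
    if v_plus_2_release is not None:
        candidates.append(v_plus_2_release)
    return min(candidates) if candidates else None


def module_compatible(built_against: str, running: str) -> bool:
    """True when a module and the running platform share the same major version."""
    return built_against.split(".")[0] == running.split(".")[0]


if __name__ == "__main__":
    # Hypothetical release dates, for illustration only.
    print(rest_support_end(date(2016, 1, 1), None))
    print(rest_support_end(date(2016, 1, 1), date(2016, 6, 1)))
    print(module_compatible("5.1", "5.2"), module_compatible("5.1", "6.0"))
```

The key design point is that standalone integrations track the ReST API version, while in-process modules track the Alfresco major version.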
Alfresco SDK: Making Alfresco development even more productive, safe and fun
What’s out already
• Alfresco SDK 2.1.0 - Compatible with 5.0, with hot reloading (Platform & Share)
• Alfresco SDK 2.1.1 - Multiple bug-fixes, backward compatible
Together with Alfresco.next
• Fully supported, easily forkable and complete set of samples on alfresco-sdk-samples (on GitHub)
• Improved hot reloading
• Customer value driven prioritization of public GitHub issues. Request enhancements at https://github.com/Alfresco/alfresco-sdk/issues
What & when? An outlook on our target Platform release plan
Information provided in the following slide is roadmap information and therefore subject to change in scope, timelines and context.
Platform release targets
1. Target: Alfresco.next -> Early 2016
– Both Platform and Share
– Includes all major Developer Platform improvements
– Solr sharding and scalability collaterals
– Full revamp of developer documentation
2. Post Alfresco.next -> 2016
– Share can follow a more frequent release schedule
– Strategic improvements in the ReST API (vs Java), functional and non-functional
– More modularization, for agility and scalability purposes
Conclusions and Q/A
Take-aways: What you really need to remember about today’s session
1. API Lifecycle
– Fundamental to avoid dependency hell
– Clear, documented, easy to use and supported extension points
– Key factor to drive seamless upgrades
2. Developer Platform
– JAR modules
– Share modules support and reporting
3. Extreme Scalability
– Solr Sharding
– TMDQ improvements
– New collaterals for sizing, scalability and reference architectures
4. Share lifecycle separation
5. Upgrade task force
Any questions? Feel free to send your feedback to [email protected] or @mindthegabz