Algolia - Hosted Search API
Transcript of Algolia - Hosted Search API
![Page 1: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/1.jpg)
Instant Search API
Build Unique Search Experiences
Sylvain Utard, VP of Engineering
[email protected] @sylvainutard
Enterprise Search and Analytics
![Page 2: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/2.jpg)
@algolia
Who am I?
5 years @ Exalead, leading the core-engine & NLP teams
• C++ • ExaScript (RIP) • Java
2 years @ Algolia, VP of Engineering • C++ • Ruby • Java • and 10+ other languages…
@sylvainutard
![Page 3: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/3.jpg)
A hosted search API
![Page 4: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/4.jpg)
A hosted search API
![Page 5: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/5.jpg)
![Page 6: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/6.jpg)
A hosted search API
Replies in milliseconds
![Page 7: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/7.jpg)
A hosted search API
Replies in milliseconds
From anywhere
![Page 8: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/8.jpg)
A hosted search API
Replies in milliseconds
From anywhere With intuitive relevance
![Page 9: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/9.jpg)
Algolia Today
![Page 10: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/10.jpg)
800+ customers in 80+ countries
Algolia Today
![Page 11: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/11.jpg)
800+ customers in 80+ countries
40B+ Write operations per month
4B+ User-generated queries per month
Algolia Today
![Page 12: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/12.jpg)
Algolia Today
13 locations
800+ customers in 80+ countries
40B+ Write operations per month
4B+ User-generated queries per month
![Page 13: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/13.jpg)
Performance is our DNA
![Page 14: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/14.jpg)
Speed matters
Half a second delay caused a 20% drop in traffic
Every 100ms of latency costs them 1% in sales
![Page 15: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/15.jpg)
Behind the scenes
![Page 16: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/16.jpg)
Unique set of constraints
High volume of Read & Write operations
![Page 17: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/17.jpg)
Unique set of constraints
High volume of Read & Write operations
High-availability
![Page 18: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/18.jpg)
Unique set of constraints
High volume of Read & Write operations
High-availability
Worldwide data distribution
![Page 19: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/19.jpg)
API Software Stack
Started as a mobile offline SDK
Written in C++
Search code embedded in Nginx as a module
Indexing is done in a separate process
Two Redis instances
![Page 20: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/20.jpg)
API Hardware
Fast CPU (Xeon E5 >3.5GHz)
In Memory (128GB)
Backed by High-end SSD in Raid-0 (800GB)
Specific kernel settings
![Page 21: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/21.jpg)
Scaling horizontally
Several clusters per location
A user is assigned to one master cluster
A user can be replicated to N replica clusters
![Page 22: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/22.jpg)
What is a cluster
Master-Master
Stream of writes via Consensus
At least 3 machines
![Page 23: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/23.jpg)
A write in practice
One of the machines accepts the write operation via the API (HTTPS)
/1/indexes/MyFirstIndex/batch
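As a rough illustration of what such a write could look like from the client side, here is a minimal sketch that builds (but does not send) a batch request. Only the `/1/indexes/MyFirstIndex/batch` path comes from the slide. The host pattern, the two `X-Algolia-*` headers and the `requests` / `action` / `body` payload shape are assumptions based on Algolia's public REST API and should be checked against the current documentation. `YourAppID` and `YourAdminKey` are placeholders.

```python
import json
import urllib.request


def build_batch_request(app_id, api_key, index_name, records):
    """Build (but do not send) a batch write request for one index."""
    # Assumed payload shape: one "addObject" operation per record.
    payload = {"requests": [{"action": "addObject", "body": r} for r in records]}
    return urllib.request.Request(
        # Assumed host pattern; only the path is taken from the slide.
        url=f"https://{app_id}.algolia.net/1/indexes/{index_name}/batch",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-Algolia-Application-Id": app_id,  # assumed header name
            "X-Algolia-API-Key": api_key,        # assumed header name
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_batch_request(
    "YourAppID", "YourAdminKey", "MyFirstIndex",
    [{"name": "first record"}, {"name": "second record"}],
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would send it; that step is left out so the sketch runs without credentials or network access.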
![Page 24: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/24.jpg)
A write in practice
The file is saved on the three machines as a temporary file
tmp1265
tmp7864
tmp2357
![Page 25: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/25.jpg)
A write in practice
Launch the consensus by contacting the Raft master
startConsensus(tmp2357, tmp7864, tmp1265)
![Page 26: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/26.jpg)
A write in practice
1- The master sends the commit order to all nodes
2- Each node returns the next job ID to the master
3- If there is a majority, the file is committed
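The three steps above can be sketched as a small simulation. This is a deliberately simplified model, not the actual Raft implementation: leader election, terms and log matching are omitted, and the `Node` class and `commit` function are hypothetical names introduced for illustration only.

```python
from collections import Counter


class Node:
    """One machine of the cluster; remembers the ID of its last committed job."""

    def __init__(self, name, last_job_id=0, up=True):
        self.name = name
        self.last_job_id = last_job_id
        self.up = up

    def propose_next_job_id(self):
        # A node that is down does not answer the commit order.
        return self.last_job_id + 1 if self.up else None


def commit(nodes):
    """Return the committed job ID, or None when no majority agrees."""
    # Steps 1 and 2: send the commit order and collect the proposed job IDs.
    answers = [node.propose_next_job_id() for node in nodes]
    votes = Counter(a for a in answers if a is not None)
    if not votes:
        return None
    job_id, count = votes.most_common(1)[0]
    # Step 3: commit only if a strict majority of the whole cluster agrees.
    if count <= len(nodes) // 2:
        return None
    for node in nodes:
        if node.up and node.last_job_id + 1 == job_id:
            node.last_job_id = job_id
    return job_id


cluster = [Node("a", 41), Node("b", 41), Node("c", 41)]
print(commit(cluster))
```

With three machines, two matching answers are enough to commit, which is why the cluster keeps accepting writes when a single host is down.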
![Page 27: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/27.jpg)
A write in practice
Same job ID on all hosts
Sent to the replica clusters in parallel
Processed in parallel on all hosts
job42
job42
job42
![Page 28: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/28.jpg)
In case one host is down
Continue to accept writes
The two other hosts keep jobs
Jobs are sequential; the host will catch up at restart
job42
job42
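Because jobs carry sequential IDs, a restarted host only needs to replay the jobs whose ID is greater than the last one it applied. A minimal sketch of that idea follows; the `Host` class and the `(key, value)` job format are hypothetical, and the real jobs are indexing operations rather than dictionary writes.

```python
class Host:
    """A host holding an index (a dict here) and the ID of its last applied job."""

    def __init__(self):
        self.index = {}
        self.last_job_id = 0

    def apply_pending(self, job_log):
        """Replay, in job-ID order, every job this host has not applied yet."""
        for job_id, (key, value) in sorted(job_log.items()):
            if job_id > self.last_job_id:
                self.index[key] = value
                self.last_job_id = job_id


job_log = {}
healthy, restarted = Host(), Host()

job_log[41] = ("doc1", "v1")
healthy.apply_pending(job_log)
restarted.apply_pending(job_log)

# The second host goes down; the cluster still accepts jobs 42 and 43.
job_log[42] = ("doc2", "v1")
job_log[43] = ("doc1", "v2")
healthy.apply_pending(job_log)

# At restart, the host replays the jobs it missed and converges.
restarted.apply_pending(job_log)
print(restarted.index == healthy.index)
```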
![Page 29: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/29.jpg)
Distribution
Replicate jobs, not the result
Send to all machines in parallel
Consistent with a delay of a few seconds
![Page 30: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/30.jpg)
High availability
Multi-regions in one location
![Page 31: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/31.jpg)
High availability
13 fully independent locations
![Page 32: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/32.jpg)
Network Optimisations
API usage moving from servers to browser and mobile apps
Get close to end users
![Page 33: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/33.jpg)
Distributed Search Network - Worldwide Synchronization
![Page 34: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/34.jpg)
Distributed Search Network - Worldwide Synchronization
![Page 35: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/35.jpg)
• 13 locations = 25 datacenters • No ideal worldwide provider
• AWS is not in India, Eastern EU, Africa…
• Need to handle several providers
• Anticipate long deliveries / customs
• Keep as few providers as possible
Distributed Search Network - Worldwide Synchronization
![Page 36: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/36.jpg)
DNS is key
Used to find the closest location
Several DNS providers
Good anycast network
![Page 37: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/37.jpg)
API Clients
DNS health checks are not enough
Smart retry logic in all our API Clients
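The slides do not detail the retry logic, so the following is only a sketch of one common approach: try a list of hosts in order and fall through to the next one on a connection error or timeout. The host names and the `send` callback are hypothetical; a real client would also need timeouts and would likely shuffle or rank the fallback hosts.

```python
def request_with_retry(hosts, send):
    """Try each host in turn; return the first successful response."""
    errors = []
    for host in hosts:
        try:
            return send(host)
        except (ConnectionError, TimeoutError) as exc:
            # Remember the failure and move on to the next host.
            errors.append((host, exc))
    raise ConnectionError(f"all hosts failed: {errors}")


def fake_send(host):
    """Stand-in transport: the first host is unreachable, the others answer."""
    if host == "search-1.example.com":
        raise ConnectionError("unreachable")
    return f"response from {host}"


hosts = ["search-1.example.com", "search-2.example.com", "search-3.example.com"]
print(request_with_retry(hosts, fake_send))
```

Doing this in the client means a failing host is bypassed immediately, without waiting for a DNS health check to notice and for cached DNS records to expire.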
![Page 38: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/38.jpg)
Analytics
• What are my users searching for?
• Top search
• Top search without hits
• Top refinements
• Where do they search from?
![Page 39: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/39.jpg)
![Page 40: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/40.jpg)
![Page 41: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/41.jpg)
Analytics
• Billions of user-generated queries per month
• As-you-type aggregation
• ~3 months retention
• Storing all of them in…
![Page 42: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/42.jpg)
Analytics
• Elasticsearch \o/
• … without FTS :)
• but with aggregations
![Page 43: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/43.jpg)
Analytics
• No FTS
• No source
• Doc values everywhere
• SSD only
• Custom aggregations
(deprecated since ES 1.1.0)
![Page 44: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/44.jpg)
Top-k Aggregation
• Before
• Linear memory consumption
• Exhaustive
• After
• Constant memory consumption
• Approximate, but good enough
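The slides do not say which algorithm backs the constant-memory version. One well-known technique with exactly this trade-off is the Space-Saving algorithm, sketched below as an illustration rather than as the implementation used here. It tracks at most `capacity` distinct queries, so memory does not grow with the number of distinct queries, and its counts can be overestimated but never underestimated.

```python
class SpaceSaving:
    """Approximate top-k counter that tracks at most `capacity` distinct items."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.counts = {}

    def add(self, item):
        if item in self.counts:
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            self.counts[item] = 1
        else:
            # Evict the item with the smallest count; the newcomer inherits
            # that count plus one, which is where the overestimation comes from.
            victim = min(self.counts, key=self.counts.get)
            self.counts[item] = self.counts.pop(victim) + 1

    def top(self, k):
        """Return the k tracked items with the highest (approximate) counts."""
        return sorted(self.counts.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Two frequent queries followed by a long tail of one-off queries.
stream = ["iphone"] * 50 + ["ipad"] * 30 + [f"rare-{i}" for i in range(20)]

counter = SpaceSaving(capacity=10)
for query in stream:
    counter.add(query)
print(counter.top(2))
```

An exhaustive counter would hold 22 entries for this stream; the sketch never holds more than 10, and the long tail only competes for the low-count slots.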
![Page 45: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/45.jpg)
Building your worldwide infra
- Is a long and difficult quest
- Is a real asset & differentiator
The Future of APIs is Distributed
![Page 46: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/46.jpg)
All the details of our architecture are on HighScalability.com
Want to know more?
![Page 47: Algolia - Hosted Search API](https://reader033.fdocuments.in/reader033/viewer/2022042820/55d18bafbb61eb7f6f8b471c/html5/thumbnails/47.jpg)
THANK YOU!
[email protected] @algolia
Build Unique Search Experiences
We are hiring in SF, NYC and Paris 😊