Using BigQuery as a main Big Data solution
Transcript of Using BigQuery as a main Big Data solution
![Page 1: Using BigQuery as a main Big Data solution](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c29088bb61eb612b8b45ee/html5/thumbnails/1.jpg)
Nikolay Novozhilov Wego.com
Using BigQuery as a main Big Data solution
![Page 2: Using BigQuery as a main Big Data solution](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c29088bb61eb612b8b45ee/html5/thumbnails/2.jpg)
About Wego
Wego.com is Asia Pacific and the Middle East’s leading flight/hotel metasearch engine used by millions of travelers.
Wego was founded in 2005 in Singapore
![Page 3: Using BigQuery as a main Big Data solution](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c29088bb61eb612b8b45ee/html5/thumbnails/3.jpg)
Introducing BigQuery
Service for interactive analysis of massive datasets (TBs)
Query billions of rows: seconds to write, seconds to return
Uses a SQL-style query syntax
It's a service, accessed by a RESTful API
Pay only for what you use
Based on Dremel, an internal Google tool
Column oriented, append only…
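To make the "service accessed by a RESTful API" point concrete, here is a minimal sketch that builds, but does not send, a query request. It assumes the v2 REST `jobs.query` method; the project ID, dataset, table name, and token are hypothetical placeholders, not anything from the talk.

```python
import json
import urllib.request

# Assumption: base URL of the BigQuery v2 REST API.
BASE_URL = "https://bigquery.googleapis.com/bigquery/v2"


def build_query_request(project_id, sql, access_token, dry_run=False):
    """Build (without sending) an HTTP POST for the jobs.query method."""
    body = {
        "query": sql,
        "useLegacySql": False,  # run the query as standard SQL
        "dryRun": dry_run,      # True asks for validation only, no execution
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/projects/{project_id}/queries",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical project, table, and token, for illustration only.
request = build_query_request(
    project_id="my-project",
    sql="SELECT COUNT(*) FROM my_dataset.searches",
    access_token="PLACEHOLDER_TOKEN",
)
print(request.get_method(), request.full_url)
```

Sending the request would need a real OAuth 2.0 access token; in practice most teams would probably use an official client library rather than raw HTTP.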
![Page 4: Using BigQuery as a main Big Data solution](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c29088bb61eb612b8b45ee/html5/thumbnails/4.jpg)
Data architecture in Wego
...
![Page 5: Using BigQuery as a main Big Data solution](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c29088bb61eb612b8b45ee/html5/thumbnails/5.jpg)
Why did we do it?
MySQL
“Zoo”
BigQuery
![Page 6: Using BigQuery as a main Big Data solution](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c29088bb61eb612b8b45ee/html5/thumbnails/6.jpg)
Why is Hadoop more popular?
![Page 7: Using BigQuery as a main Big Data solution](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c29088bb61eb612b8b45ee/html5/thumbnails/7.jpg)
My collection of concerns
Your data goes to the cloud
Not open-source, Google can stop the service
“Strange” pricing model
Hadoop is trending, has bigger community
Append only database
???
![Page 8: Using BigQuery as a main Big Data solution](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c29088bb61eb612b8b45ee/html5/thumbnails/8.jpg)
Costs: storage + cost per query
Same fallacy again:
“I want to launch a mom-and-pop shop – let’s buy a building”
“I want to build a site – let’s buy servers”
“I want big data – let’s build a data warehouse”
Usual concerns:
No realistic estimate upfront
“Fear of running a query”
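The "no realistic estimate upfront" concern can be reduced with simple arithmetic on the two cost components named above, storage plus cost per query. The sketch below is a rough model only: the function name is hypothetical and both prices are placeholders, not actual BigQuery rates, so substitute the current price list before relying on it.

```python
GB = 10 ** 9   # bytes per gigabyte (decimal units, an assumption)
TB = 10 ** 12  # bytes per terabyte (decimal units, an assumption)


def estimate_monthly_cost(stored_bytes, scanned_bytes_per_query,
                          queries_per_month, storage_price_per_gb,
                          query_price_per_tb):
    """Return (storage_cost, query_cost, total) for one month."""
    storage_cost = stored_bytes / GB * storage_price_per_gb
    query_cost = (scanned_bytes_per_query / TB
                  * query_price_per_tb * queries_per_month)
    return storage_cost, query_cost, storage_cost + query_cost


# Placeholder workload and prices, for illustration only.
storage_cost, query_cost, total = estimate_monthly_cost(
    stored_bytes=2 * TB,               # 2 TB kept in tables
    scanned_bytes_per_query=50 * GB,   # each query reads 50 GB of columns
    queries_per_month=1000,
    storage_price_per_gb=0.02,         # placeholder price, not a real rate
    query_price_per_tb=5.0,            # placeholder price, not a real rate
)
print(round(storage_cost, 2), round(query_cost, 2), round(total, 2))
```

Because storage is column oriented, the bytes scanned depend on which columns a query touches, so selecting only the needed columns lowers the per-query term in this model.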
![Page 9: Using BigQuery as a main Big Data solution](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c29088bb61eb612b8b45ee/html5/thumbnails/9.jpg)
StackOverflow support
53 minutes!
![Page 10: Using BigQuery as a main Big Data solution](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c29088bb61eb612b8b45ee/html5/thumbnails/10.jpg)
Append only…
Slowly changing dimensions:
daily re-load from MySQL
daily upload from MySQL, keeping history
Absolutely necessary updates:
do you really need it?
BigQuery lets you save query results back to the initial table:
Your table
Query
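The "daily upload, keeping history" approach can be sketched without any updates at all: each day appends a full copy of the dimension, and the current state is derived with a query. The example below uses an in-memory SQLite database as a stand-in for an append-only table; the table name, columns, and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hotels_history "
    "(snapshot_date TEXT, hotel_id INTEGER, name TEXT)"
)


def append_snapshot(conn, snapshot_date, rows):
    """Append one day's full copy of the dimension; never UPDATE or DELETE."""
    conn.executemany(
        "INSERT INTO hotels_history VALUES (?, ?, ?)",
        [(snapshot_date, hotel_id, name) for hotel_id, name in rows],
    )


# Two daily uploads; hotel 1 is renamed between them (sample data only).
append_snapshot(conn, "2014-01-01", [(1, "Grand Hotel"), (2, "Sea View")])
append_snapshot(conn, "2014-01-02", [(1, "Grand Hotel & Spa"), (2, "Sea View")])

# Current state: rows from the most recent snapshot.
CURRENT_SQL = """
SELECT hotel_id, name FROM hotels_history
WHERE snapshot_date = (SELECT MAX(snapshot_date) FROM hotels_history)
ORDER BY hotel_id
"""

# Historical state: rows as they were on a given day.
AS_OF_SQL = """
SELECT hotel_id, name FROM hotels_history
WHERE snapshot_date = ?
ORDER BY hotel_id
"""

current = conn.execute(CURRENT_SQL).fetchall()
as_of_first_day = conn.execute(AS_OF_SQL, ("2014-01-01",)).fetchall()
print(current)
print(as_of_first_day)
```

The trade-off is storage: every day stores the whole dimension again, which is usually acceptable when the dimension is small compared with the event data.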
![Page 11: Using BigQuery as a main Big Data solution](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c29088bb61eb612b8b45ee/html5/thumbnails/11.jpg)
Actually useful - “Discovery mode”
![Page 12: Using BigQuery as a main Big Data solution](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c29088bb61eb612b8b45ee/html5/thumbnails/12.jpg)
Actually useful
Huge joins
REGEXP_MATCH(), …
Rich SQL - window functions
Nested data
![Page 13: Using BigQuery as a main Big Data solution](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c29088bb61eb612b8b45ee/html5/thumbnails/13.jpg)
My answer
![Page 14: Using BigQuery as a main Big Data solution](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c29088bb61eb612b8b45ee/html5/thumbnails/14.jpg)
What is the Big Data revolution?
There is no difference between big data and small data anymore
![Page 16: Using BigQuery as a main Big Data solution](https://reader030.fdocuments.in/reader030/viewer/2022032504/55c29088bb61eb612b8b45ee/html5/thumbnails/16.jpg)
“Yes, Sir, I tried to build an ROI case for our BI project - but I couldn’t access any reliable data!”
TimoElliott.com