Tiny Google Projects
Uploaded by ostap-andrusiv. Category: Technology.
![Page 1: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/1.jpg)
:)
![Page 2: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/2.jpg)
![Page 3: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/3.jpg)
tiny :projects
![Page 4: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/4.jpg)
![Page 5: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/5.jpg)
![Page 6: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/6.jpg)
مرحبا العالم (Hello, world)
![Page 7: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/7.jpg)
Tesseract OCR
1985: HP
2006: Google
![Page 8: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/8.jpg)
Tesseract OCR
2006: TIFF
2011: *
![Page 9: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/9.jpg)
Tesseract OCR
2009 2010
Text layout
![Page 10: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/10.jpg)
Tesseract OCR
2007: 6
2011: 33
![Page 11: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/11.jpg)
Tesseract OCR
Arabic, English, Bulgarian, Catalan, Czech, Chinese (Simplified and Traditional), Danish (standard and Fraktur script), German, Greek, Finnish, French, Hebrew, Croatian, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Dutch, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak (standard and Fraktur script), Slovenian, Spanish, Serbian, Swedish, Tagalog, Thai, Turkish, Ukrainian and Vietnamese
![Page 12: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/12.jpg)
Tesseract OCR
Officially supported:
Probably runs on:
![Page 13: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/13.jpg)
Image processing
![Page 14: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/14.jpg)
![Page 15: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/15.jpg)
![Page 16: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/16.jpg)
![Page 17: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/17.jpg)
Google Refine
![Page 18: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/18.jpg)
Runs on:
![Page 19: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/19.jpg)
Runs in:
![Page 20: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/20.jpg)
Major features:
• Import from anywhere
• Faceting
• Clustering
• Split / create custom columns
• GREL transformations
• Export / etc.
![Page 21: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/21.jpg)
![Page 22: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/22.jpg)
google protocol buffers

message Person {
  required int32 id = 1;
  required string name = 2;
  optional string email = 3;
}

Person person;
person.set_id(123);
person.set_name("Bob");
person.set_email("[email protected]");

fstream out("person.pb", ios::out ...
person.SerializeToOstream(&out);
out.close();
>
![Page 23: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/23.jpg)
512 bytes / tweet
340,000,000 tweets / day (2012)
7,253,333,333 bytes / hour
2,014,814 bytes / second
1.921 Mbytes / second
15.371 Mbits / second
8 Tbytes / day (2011)
Google: ~ 377M searches/day
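The tweet figures above follow from two inputs (512 bytes per tweet, 340,000,000 tweets per day). A minimal sketch of that arithmetic, assuming binary megabytes (1 Mbyte = 1024 × 1024 bytes), which is what makes the numbers line up; the constant and function names are made up for illustration and are not from the slides:

```cpp
#include <cstdint>
#include <cstdio>

// Inputs taken from the slide; names are hypothetical.
constexpr std::int64_t kBytesPerTweet = 512;
constexpr std::int64_t kTweetsPerDay = 340000000;  // 2012 figure

// Integer division truncates, matching the whole-byte values on the slide.
constexpr std::int64_t bytes_per_day() { return kBytesPerTweet * kTweetsPerDay; }
constexpr std::int64_t bytes_per_hour() { return bytes_per_day() / 24; }
constexpr std::int64_t bytes_per_second() { return bytes_per_hour() / 3600; }

// Assumption: "Mbytes" means 1024 * 1024 bytes.
constexpr double mbytes_per_second() {
  return static_cast<double>(bytes_per_second()) / (1024.0 * 1024.0);
}
constexpr double mbits_per_second() { return mbytes_per_second() * 8.0; }

// Prints the whole chain, one rate per line.
void print_tweet_rates() {
  std::printf("%lld bytes / hour\n", static_cast<long long>(bytes_per_hour()));
  std::printf("%lld bytes / second\n", static_cast<long long>(bytes_per_second()));
  std::printf("%.3f Mbytes / second\n", mbytes_per_second());
  std::printf("%.3f Mbits / second\n", mbits_per_second());
}
```

Calling `print_tweet_rates()` from `main` prints the chain. Note that `printf` rounds the last digit, whereas the slide appears to truncate it.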
![Page 24: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/24.jpg)
=+
![Page 25: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/25.jpg)
=+
![Page 26: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/26.jpg)
=+
![Page 27: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/27.jpg)
=+>
![Page 28: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/28.jpg)
=+>
![Page 29: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/29.jpg)
=+>
MapReduce
?
![Page 30: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/30.jpg)
![Page 31: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/31.jpg)
snappy
http://code.google.com/p/snappy/
![Page 32: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/32.jpg)
snappy
• Free and BSD
• Robust
• Fast
• Stable
![Page 33: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/33.jpg)
Size
[Bar chart: compression ratio (%) (less is better), axis 0 to 80, for lzjb 2010, lzo 2.04 1x, fastlz 0.1 -1, fastlz 0.1 -2, lzf 3.6 vf, lzf 3.6 uf, lzrw1, lzrw1-a, lzrw2, lzrw3, lzrw3-a, snappy 1.0, quicklz 1.5.0 -1, quicklz 1.5.0 -2]
![Page 34: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/34.jpg)
Data types
[Bar chart: compression ratio, axis 0 to 6, of snappy vs. zlib for plain text, html and jpeg]
![Page 35: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/35.jpg)
Size
from 20% to 100% bigger
:(
...not for amazon glacier
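To make the size comparison concrete, here is a small sketch of the two percentages involved: the compression ratio plotted on the earlier chart (compressed size as a share of the original) and "how much bigger" one compressor's output is than another's. The function names are hypothetical and the sketch is not tied to any particular compressor's API.

```cpp
#include <cstddef>

// Compressed size as a percentage of the original size (less is better).
constexpr double compression_ratio_percent(std::size_t original,
                                           std::size_t compressed) {
  return original == 0 ? 0.0
                       : 100.0 * static_cast<double>(compressed) /
                             static_cast<double>(original);
}

// How much bigger output `a` is than output `b`, in percent of `b`.
constexpr double percent_bigger(std::size_t a, std::size_t b) {
  return b == 0 ? 0.0
                : 100.0 * (static_cast<double>(a) - static_cast<double>(b)) /
                      static_cast<double>(b);
}
```

With made-up sizes, a 1000-byte input compressed to 600 bytes by one tool and to 500 bytes by another gives ratios of 60% and 50%, and `percent_bigger(600, 500)` reports the first output as 20% bigger, the low end of the range quoted above.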
![Page 36: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/36.jpg)
Speed
[Bar chart: compression speed (MB/s) (more is better), axis 0 to 250, for lzjb 2010, lzo 2.04 1x, fastlz 0.1 -1, fastlz 0.1 -2, lzf 3.6 vf, lzf 3.6 uf, lzrw1, lzrw1-a, lzrw2, lzrw3, lzrw3-a, snappy 1.0, quicklz 1.5.0 -1, quicklz 1.5.0 -2]
![Page 37: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/37.jpg)
Speed
[Bar chart: decompression speed (MB/s) (more is better), axis 0 to 500, for lzjb 2010, lzo 2.04 1x, fastlz 0.1 -1, fastlz 0.1 -2, lzf 3.6 vf, lzf 3.6 uf, lzrw1, lzrw1-a, lzrw2, lzrw3, lzrw3-a, snappy 1.0, quicklz 1.5.0 -1, quicklz 1.5.0 -2]
![Page 38: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/38.jpg)
On one core of a 64-bit Core i7 processor:
• Compression: 250 MB/s
• Decompression: 500 MB/s
:P
![Page 39: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/39.jpg)
Portable, but...
![Page 40: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/40.jpg)
Portable, but primarily optimized for 64-bit x86-compatible processors
![Page 41: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/41.jpg)
Used:
• BigTable
• MapReduce
• Google RPC
• Hadoop
![Page 42: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/42.jpg)
Bindings:
![Page 43: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/43.jpg)
@TarasRoshko
HTTP headers here:
http://code.google.com/p/snappy/source/browse/trunk/framing_format.txt
![Page 44: Tiny Google Projects](https://reader035.fdocuments.in/reader035/viewer/2022062709/558e81491a28aba50b8b47b7/html5/thumbnails/44.jpg)
Q&A?
Ostap Andrusiv
Software Engineer, Eleks Software
@p1f