Summarize Large Text using NLP on NVIDIA P3 Instance
GTC | S9628 | Mar, 2019
“Summarization is the jungle of NLP”
4 © 2019, Amazon Web Services, Inc.
Kristof Schum
Global Segment Leader
Machine Learning
AWS Partner Network
From consulting to ML PM
Automated Insights
Summarization from Wharton
Teach Summarization at MLU
1. GPU up
2. Teach
3. Innovate
Why bother?
Summarization is not as fundamental and immediately applicable as a feed-forward neural net or XGBoost.
1. Trending
2. Multifaceted
Graphs
NLP
Bayesian RNN
Linear
CNN
LDA
LSA
Clustering
3. Easy to innovate
Imagine you did not have time to take notes
Amazon Transcribe + Amazon SageMaker = Notes, Instantly
Agenda for today
• Evolution: of automatic text summarization
• Statistical: methods that are focused on finding and extracting the most expressive as-is sentences in the text
• Paraphrase: methods in the field of natural language generation that provide new text as summary
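The extractive idea in the agenda (find the most expressive sentences and lift them as-is) can be sketched in a few lines. This is a minimal illustration that scores sentences by average word frequency, in the spirit of early statistical summarizers; it is not the code used in the talk's demos. The stop-word list, the naive sentence splitter, and all function names are assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it", "on", "for", "with", "that", "this"}


def split_sentences(text):
    """Naive splitter: breaks after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOP_WORDS]


def summarize_by_frequency(text, n_sentences=2):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average word frequency, so long sentences are not favoured automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:n_sentences])
    return [sentences[i] for i in keep]


text = "I had lunch. GPUs speed up training. GPUs also speed up inference."
print(summarize_by_frequency(text, n_sentences=2))
```

Because nothing here is learned, a method of this kind runs comfortably on a CPU, which matches the talk's later point that statistical summarization does not need a GPU instance.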
1. Evolution
“A reductive transformation of source text to summary text through content condensation by selection and/or generalization on what is important in the source.”
Schematic summary processing model
Source text → (Interpretation) → Source representation → (Transformation) → Summary representation → (Generation) → Summary text
‘Genres’ of Summary?
• Indicative vs. informative: used for quick categorization vs. content processing.
• Extract vs. abstract: lists fragments of text vs. re-phrases content coherently.
• Generic vs. query-oriented: provides author’s view vs. reflects user’s interest.
• Background vs. just-the-news: assumes reader’s prior knowledge is poor vs. up-to-date.
• Single-document vs. multi-document source: based on one text vs. fuses together many texts.
Evolution of methods
[Figure: timeline of summarization methods from 1958 to 2016-18. Labels on the timeline: Luhn, Baxendale, Edmundson, Vasiliev, Rush, Pollock & Zamora, FRUMP, Kupiec, Niggemeyer, Sparck-Jones, McKeown, Radev (MultiDoc), Cremins, Lexical Chains, RST, Summons, Ocelot (Sent. Compr), MEAD (NLP), TIDES multilingual, Pyramid, Sent. Fusion, TextRank, LexRank, Citation Summ, TAC, HexTAC, Fresa, Gensim, Abstractive (Genest & Lapalme), Seq2Seq Neural Abstractive (2016-18).]
2. Statistical methods
The father of information retrieval
Let’s give it an easy time
Demo
5 sentences generated from the article:
* It’s that time of year again.
* This conference always hosts a smorgasbord of informative keynotes, exhibitors, and hands-on sessions, on a wide variety of topics.
* The program will include a women-led panel session, women-only DLI sessions, and a networking reception.
* The conference will also focus on up-and-coming fields such as finance, healthcare, and telco.
* The conference continues to expand, with more sessions, more exhibitors, and more emergent topics of discussion (healthcare, telco, finance, etc.
(link)
Let’s give it a hard time
Demo
The Blah Story
11.3M words
17,868 pages
Sentences generated from the Lord of The Rings:
* He looked at the great walls, and the towers and brave banners, and the sun in the high sky, and then at the gathering gloom in the East; and he thought of the long fingers of that Shadow: of the orcs in the woods and the mountains, the treason of Isengard, the birds of evil eye, and the Black Riders even in the lanes of the Shire - and of the winged terror, the Nazgûl. … [4 more]
A more sophisticated statistical method
Source: Mihalcea & Tarau, 2004
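The graph-based method referred to here is TextRank (Mihalcea and Tarau), the algorithm the following demos use. A minimal sketch of the idea: sentences are nodes, edge weights are word-overlap similarities, and a PageRank-style iteration ranks the sentences. This is an illustration under stated assumptions, not the demo code: tokenization is a bare regex, similarity counts unique words rather than all words, and the damping factor and iteration count are conventional defaults chosen for the example.

```python
import math
import re


def tokenize(sentence):
    """Set of lowercase word tokens (unique words only, a simplification)."""
    return set(re.findall(r"[a-z']+", sentence.lower()))


def similarity(a, b):
    """Shared-word count normalised by the log lengths of both sentences."""
    wa, wb = tokenize(a), tokenize(b)
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def textrank(sentences, damping=0.85, iterations=50):
    """Weighted PageRank over the sentence-similarity graph."""
    n = len(sentences)
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new_scores.append((1 - damping) + damping * rank)
        scores = new_scores
    return scores


def summarize(sentences, k=2):
    """Pick the k best-ranked sentences and return them in document order."""
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


docs = [
    "The conference covers deep learning and robotics.",
    "Deep learning sessions at the conference cover robotics too.",
    "My cat sleeps all day.",
]
print(summarize(docs, k=2))
```

A sentence that shares no vocabulary with the rest of the text (the third one above) receives only the baseline score, which is why it drops out of the summary.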
An alternative: similarity with CNNs
Source: Zhang, Er, Pratama - 2016
Let’s give TextRank an easy time
Demo
(link)
5 sentences generated from the article:
- The story holds true for this year’s event (held March 17-21), with NVIDIA promising to shine a spotlight on all the impactful applications of AI, including robotics and autonomous vehicles with a larger keynote area and more exhibitors.
- This year’s conference speaker roster features a who’s who in AI and deep learning, with experts from industry leaders such as Amazon, Alibaba, Google, NASA, Oak Ridge National Labs, IBM, Verizon, Volvo, PayPal, and many, many more.
- NVIDIA’s tech rock star CEO Jensen Huang will be delivering his keynote (no doubt in his signature leather jacket) on Monday afternoon, at the San Jose State event center, which seats 5,000 (2,000 more than last year’s venue).
- NVIDIA says 9 of the world’s top 12 telco companies will be attending and presenting at this year’s GTC, as well as 4 of the top 5 medical research universities and 5 of the top 7 radiology departments.
- NVIDIA promises more Deep Learning Institute (DLI) coverage this year, with six all-day workshops (including developer certification), and over 100 DLI sessions all said and told.
Let’s give TextRank a LOTR time
Demo
Sentences generated from the Lord of The Rings:
The Hobbits named it the Shire, as the region of the authority of their Thain, and a district of well-ordered business; and there in that pleasant corner of the world they plied their well-ordered business of living, and they heeded less and less the world outside where dark things moved, until they came to think that peace and plenty were the rule in Middle-earth and the right of all sensible folk.
… 4 more
No deep learning, no need for P3
m5 2xlarge: 20 secs
p3 2xlarge: 21 secs
3. Paraphrasing method
Deep learning to the rescue - RNNs
[Figure: chart comparing phrase-based SMT, syntax-based SMT, and neural MT over 2013 to 2016; vertical axis from 0 to 30.]
Source: Meta Forum 2016 - Sennrich
RNN with attention mechanisms
Source: See, Liu, Manning - 2017
“Abstracts” from the model:
TEXT: “great taffy at a great price. there was a wide assortment of yummy taffy. delivery was very quick. if your a taffy lover, this is a deal.”
PREDICTED SUMMARY: nice taffy!
ACTUAL SUMMARY: great taffy!
The power of P3 Instance on 50K items
m5 2xlarge: 187 mins
p3 2xlarge: 59 mins
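To put the two timings in proportion, here is a small worked calculation. It assumes, from the slide title and layout, that 187 minutes belongs to the m5 instance and 59 minutes to the p3 instance; the variable names are made up for the example.

```python
# Timings from the slide; the instance-to-number mapping is an assumption.
m5_minutes = 187
p3_minutes = 59

speedup = m5_minutes / p3_minutes          # how many times faster the p3 run is
time_saved = 1 - p3_minutes / m5_minutes   # fraction of wall-clock time saved

print(f"{speedup:.2f}x faster, {time_saved:.0%} less wall-clock time")
# prints "3.17x faster, 68% less wall-clock time"
```

Whether that speedup pays off also depends on the hourly price of each instance type, which the slide does not state.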
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Amazon Trademark
AI SERVICES
• Vision: Rekognition Image, Rekognition Video
• Speech: Polly, Transcribe
• Language: Translate, Comprehend
• Chatbots: Lex
• Forecasting: Forecast
• Recommendations: Personalize
• Textract
ML SERVICES
• Amazon SageMaker: Build, Train, Deploy
• Pre-built algorithms & notebooks
• Data labeling (Ground Truth)
• One-click model training & tuning
• Optimization (Neo)
• One-click deployment & hosting
• Reinforcement learning
• Algorithms & models (AWS Marketplace for Machine Learning)
ML FRAMEWORKS & INFRASTRUCTURE
• Frameworks, Interfaces, Infrastructure
• EC2 P3 & P3N, EC2 C5, FPGAs, Greengrass, Elastic Inference
Thank you for your interest.
The Goal: Pre-train + Finetune in NLP
Previously, context representation was either unidirectional or only token-level (missing the bigger picture).
2018 Major NLP Advances
• Transformer – Attention Is All You Need: Vaswani et al. (Google), technically 2017
• ULMFiT – Universal Language Model Fine-tuning for Text Classification: Howard & Ruder (fast.ai, AYLIEN)
• ELMo – Deep contextualized word representations: Peters et al. (AI2, UW)
• GPT Transformer – Improving Language Understanding by Generative Pre-Training: Radford et al. (OpenAI)
• BERT – Pre-training of Deep Bidirectional Transformers for Language Understanding: Devlin et al. (Google)
Among many more…
Transformer – Attention Is All You Need
• No recurrent layers (RNN/LSTM); allows parallelization
• Transformer: the basic building block comprises attention and FFN layers
• Both encoder and decoder are composed of stacked Transformer blocks.
• Can be trained significantly faster.
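The attention layer mentioned in the bullets above can be sketched as scaled dot-product self-attention. The following is a minimal single-head version in plain Python, meant only to show the data flow; the toy dimensions, the identity projection matrices in the usage lines, and all function names are assumptions for illustration, and a real implementation would use a tensor library on the GPU.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x holds one row per token; w_q, w_k, w_v project tokens to queries,
    keys and values. Returns the attended outputs and the attention weights.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, attn = self_attention(tokens, identity, identity, identity)
print(attn[2])  # attention distribution of the third token over all tokens
```

Every row of the score matrix is computed independently of the others, which is the property that lets the whole sequence be processed in parallel instead of step by step as in an RNN.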
Self-Attention Intuition
Credit: https://jalammar.github.io/illustrated-transformer/
Multi-Head Attention
Parallel Attention Layers
Credit: https://jalammar.github.io/illustrated-transformer/
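Encoder
• Constant layer dimension: $d_{model} = 512$
• Attention sub-layer: $Y = \mathrm{LayerNorm}(u + \mathrm{MultiHeadAttn}(u))$; comprises 8 parallel self-attention layers
• Feed-forward sub-layer: $Y = \mathrm{LayerNorm}(u + \mathrm{FFN}(u))$
• $\mathrm{FFN}(x) = \mathrm{ReLU}(x W_1)\, W_2 + b$, with $W_1 \in \mathbb{R}^{d_{model} \times 2048}$ and $W_2 \in \mathbb{R}^{2048 \times d_{model}}$
• Applies dropout to every sub-layer before the norm, and to the embedding layers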
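The feed-forward sub-layer of the encoder, Y = LayerNorm(u + FFN(u)), can be sketched for a single token vector as follows. This is a minimal illustration, not a reference implementation: it uses toy sizes (4 and 8 instead of 512 and 2048), a single output bias as on the slide, a layer norm without learned gain and bias, and no dropout. All names and the constant weights are assumptions for the example.

```python
import math


def layer_norm(v, eps=1e-6):
    """Normalise a vector to zero mean and (almost) unit variance."""
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]


def ffn(x, w1, w2, b):
    """FFN(x) = ReLU(x W1) W2 + b for one token vector x."""
    hidden = [max(0.0, sum(xi * w for xi, w in zip(x, col))) for col in zip(*w1)]
    return [sum(h * w for h, w in zip(hidden, col)) + bj for col, bj in zip(zip(*w2), b)]


def ffn_sublayer(u, w1, w2, b):
    """Residual connection followed by layer normalisation."""
    return layer_norm([ui + fi for ui, fi in zip(u, ffn(u, w1, w2, b))])


d_model, d_ff = 4, 8                       # toy sizes standing in for 512 and 2048
w1 = [[0.1] * d_ff for _ in range(d_model)]
w2 = [[0.1] * d_model for _ in range(d_ff)]
b = [0.0] * d_model
u = [1.0, 2.0, 3.0, 4.0]
print(ffn_sublayer(u, w1, w2, b))
```

The output has the same dimension as the input, which is what allows identical blocks to be stacked on top of each other.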
ULMFiT – Universal Language Model Fine-tuning for Text Classification
• Key takeaways:
Effective transfer learning for NLP (using LSTMs)
Introduces novel language model fine-tuning techniques
Helps solve NLP problems with less data
ELMo – Embeddings from language models
• Key takeaways:
Word embedding values conditioned on context
- Handles polysemy
Trained using BiLSTM on next-word-prediction task
ELMo – Deep contextualized word representations
GPT Transformer – Generative Pre-Training
• Setting the stage for multi-task NLP
• Key takeaways:
Combining unsupervised pre-training with Transformers
- Building upon ULMFiT, ELMo
The OpenAI Transformer
- Only Transformer decoders, trained on prediction and classification
- No encoder-decoder attention sublayer
- Remember: Transformer decoder masks future tokens
- Note: Only a forward language model, not bidirectional
SOTA performance on GLUE benchmark
- Shows Transformer is flexible and robust
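The reminder that a Transformer decoder masks future tokens can be made concrete with a causal mask applied before the softmax. The sketch below is a minimal plain-Python illustration under assumed names (`causal_mask`, `masked_softmax`); it is not taken from the GPT code.

```python
import math


def causal_mask(n):
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def masked_softmax(scores, mask):
    """Row-wise softmax that gives zero weight to masked-out (future) positions."""
    out = []
    for row, allowed in zip(scores, mask):
        visible = [s for s, ok in zip(row, allowed) if ok]
        m = max(visible)
        exps = [math.exp(s - m) if ok else 0.0 for s, ok in zip(row, allowed)]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


scores = [[0.0] * 3 for _ in range(3)]   # equal raw scores for every pair
for row in masked_softmax(scores, causal_mask(3)):
    print(row)
```

With equal raw scores, the first token can only attend to itself, the second splits its attention over two positions, and the third over three. This left-to-right restriction is what makes the model a forward-only language model rather than a bidirectional one.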
GPT Transformer – Generative Pre-Training
• Multitasking trick: Input transformations for various tasks
BERT: Bidirectional Encoder Representations from Transformers
Secret Sauce #1: Masked LM
• Before feeding word sequences into BERT, 15% of the words in each sequence are replaced with a [MASK] token.
• The model then attempts to predict the original value of the masked words, based on the context provided by the other, non-masked words in the sequence.
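The masking step described above can be sketched as a small data-preparation function. This is a simplified illustration of the procedure as the slide states it: it always substitutes [MASK], whereas the BERT paper sometimes substitutes a random word or leaves the chosen word unchanged. The function name, the fixed seed, and the placeholder tokens are assumptions for the example, and the input is assumed to be non-empty.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Replace about mask_rate of the tokens with [MASK].

    Returns the masked sequence and a dict mapping each masked position
    to the original token, which is what the model must predict.
    """
    rng = random.Random(seed)
    n_to_mask = max(1, round(len(tokens) * mask_rate))
    positions = sorted(rng.sample(range(len(tokens)), n_to_mask))
    masked = list(tokens)
    labels = {}
    for pos in positions:
        labels[pos] = tokens[pos]
        masked[pos] = MASK
    return masked, labels


tokens = [f"w{i}" for i in range(20)]    # placeholder tokens
masked, labels = mask_tokens(tokens)
print(masked)
print(labels)
```

Because the prediction target sits in the middle of the sequence, the model can use context from both the left and the right, which a forward-only language model cannot do.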
Secret Sauce #2: Next Sentence Prediction
Results: Surpassing Humans
| Task | Best Legacy AI | BERT | Gain | Human |
| --- | --- | --- | --- | --- |
| Sentence pair completion (SWAG) | 59.2 | 86.3 | +27.1 bps | 85.0 |
| Single Sentence Classif. (CoLA) | 45.4 | 60.5 | +15.1 bps | |
| Question Answer (SQuAD) | 88.5 | 93.2 | +4.7 bps | 91.7 |
| Semantic Equivalence (QQP) | 70.3 | 72.1 | +1.8 bps | |
Thank you!