AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)


Transcript of AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Page 1: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Jed Sundwall, Open Data Global Lead

November 29, 2016

Earth on AWS: Working with Planetary-scale Data in the Cloud

Page 2: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

What to Expect from the Session

• About open data on AWS

• Advantages of Amazon S3 for sharing data

• How E&J Gallo and DigitalGlobe use AWS to work with geospatial data

Page 3: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Why Does AWS Care About Open Data?

Sharing data on AWS makes it accessible to a large and growing community of researchers, entrepreneurs, and enterprises who use the AWS cloud.

Page 4: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

“…data must be organized, well-documented, consistently formatted, and error free. Cleaning the data is often the most taxing part of data science, and is frequently 80% of the work.”

- Data Driven by DJ Patil and Hilary Mason

Undifferentiated Heavy Lifting

Page 5: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Traditional Data Acquisition

Data Acquisition in the Cloud

Page 6: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

When data is shared in the cloud, anyone can analyze any volume of data without needing to download or store it themselves.

Page 7: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Landsat on AWS

Page 8: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Landsat

Landsat 8 satellite Raster data

Page 9: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Wellington, New Zealand, shown in three band combinations:

• RGB: visible light

• Infrared: vegetation

• Shortwave infrared: urban areas
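
The slide pairs the infrared view with vegetation. One common way to quantify that relationship is the Normalized Difference Vegetation Index (NDVI), which compares near-infrared and red reflectance. The sketch below is an illustration, not something shown in the talk: the reflectance values are hypothetical, and it assumes the usual Landsat 8 band assignment (band 4 for red, band 5 for near-infrared).

```python
def ndvi(red: float, nir: float) -> float:
    """Normalized Difference Vegetation Index for a single pixel.

    NDVI = (NIR - Red) / (NIR + Red), ranging from -1 to 1.
    Returns 0.0 when both inputs are zero to avoid dividing by zero.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


# Hypothetical surface reflectance values as (red, near-infrared) pairs.
# These are illustrative numbers, not measurements from the Wellington scene.
pixels = {
    "vegetation": (0.05, 0.45),
    "urban": (0.20, 0.25),
    "water": (0.03, 0.01),
}

# Healthy vegetation reflects strongly in the near-infrared, so it scores high.
scores = {name: ndvi(red, nir) for name, (red, nir) in pixels.items()}

for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

With real data, the same formula would be applied element-wise to the red and near-infrared rasters of a scene.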

Page 10: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Landsat on AWS

USGS (.tar) → s3://landsat-pds (.tiff) → Amazon EC2
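
Because each band is stored as its own object in the bucket, a single band can be addressed directly instead of unpacking a whole archive. The sketch below builds the S3 URI for one band of a Landsat 8 scene. The key layout (`L8/<path>/<row>/<scene>/<scene>_B<band>.TIF`) and the 21-character scene ID format are assumptions based on how the bucket was organized around the time of the talk; they are not stated on the slide, and the layout may have changed since.

```python
BUCKET = "landsat-pds"  # bucket name from the slide


def parse_scene_id(scene_id: str) -> dict:
    """Split a pre-collection Landsat 8 scene ID into its parts.

    Assumed format: LXS PPP RRR YYYY DDD GSI VV (21 characters), for example
    LC8 139 045 2014 295 LGN 00.
    """
    if len(scene_id) != 21 or scene_id[0] != "L" or scene_id[2] != "8":
        raise ValueError(f"not a Landsat 8 scene ID: {scene_id!r}")
    return {
        "path": scene_id[3:6],             # WRS-2 path
        "row": scene_id[6:9],              # WRS-2 row
        "year": int(scene_id[9:13]),       # acquisition year
        "day_of_year": int(scene_id[13:16]),
    }


def band_uri(scene_id: str, band: int) -> str:
    """Return the S3 URI of one band's GeoTIFF under the assumed key layout."""
    if not 1 <= band <= 11:  # Landsat 8 has bands 1 through 11
        raise ValueError(f"band must be between 1 and 11, got {band}")
    parts = parse_scene_id(scene_id)
    return (
        f"s3://{BUCKET}/L8/{parts['path']}/{parts['row']}/"
        f"{scene_id}/{scene_id}_B{band}.TIF"
    )


print(band_uri("LC81390452014295LGN00", 4))
```

A URI like this could then be handed to a tool such as GDAL or Rasterio running on Amazon EC2, so that only the bands an analysis needs are read.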

Page 11: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Landsat on AWS

Graph by Drew Bollinger (@drewbo19) at Development Seed

Page 13: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

landsatonaws.com: Serverless browser interface to Landsat on AWS

Page 14: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

New open source tools

• GDAL – http://www.gdal.org/

• Rasterio – https://mapbox.github.io/rasterio/

• sat-utils suite – https://github.com/sat-utils

Page 15: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Elevation Models

Aerial Imagery

Climate Models

Satellite Imagery

High-resolution Radar

aws.amazon.com/earth

Page 16: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

AWS Cloud Credits for Research provide promotional AWS cloud credits for anyone to conduct research using Earth Observation data.

aws.amazon.com/earth/research-credits

Page 17: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

John Webb, Manager, Information Technology, E & J Gallo Winery

Wine and Grape Supply Data Lake: E & J Gallo

Page 18: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

E&J GALLO WINERY

Background

Page 19: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

E&J Gallo Winery

Largest Winery in the world

• Established in 1933 and headquartered in Modesto, California, E. & J. Gallo Winery remains a privately held and ever-growing company that employs 6,000 people worldwide.

• Largest land owner in the State of California

Products and Distribution

• With products available in more than 90 countries, E. & J. Gallo Winery is the largest exporter of California wine, and imports wines from Argentina, Italy, New Zealand, and Spain

• We hold 90 brands, which include table and sparkling wines, beverage products, dessert wines, and distilled spirits

Page 20: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Gallo Portfolio

Popular

Premium

Ultra Premium

Page 21: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

WINEGROWING

Perennial v. Row Crop

Page 22: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Wine and Grape Supply

Grower Relations

• Manage external grape purchases

• Inform growers on best management practices

Gallo Vineyards Inc.

• Gallo-owned vineyards

• Irrigation management

• Yield and Quality estimation

• Best management practices for vineyard management

• Operations planning and scheduling

Viticulture, Chemistry and Enology

• Internal research division

• Quality and yield enhancement

• Variable rate irrigation

Winemaking

Page 23: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Complexities of Wine Growing

Not a Biomass Product

• Manage more than just nitrogen and water

• Many different activities are performed on any given vine

Perennial Crop

• Average vineyard lifecycle of 24 years

• Vine management is an ongoing activity

• Canopy Management

• Shoot Thinning

• Vine Balancing

• Trellis Management

• Nutrients

• Pesticides

High Value Crop Management

• Direct correlation between ranch management plan and quality

• Yield impact associated with canopy management activities

Page 24: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

DATA DRIVEN INSIGHTS

Data Lake Modeling

Page 25: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Why Data Matters

We operate in an activity-based agricultural environment from which we can drive insights

• Events or Transactions help us manage our business

• These events are both structured and unstructured

Structured Events such as…

• Tons of grapes delivered to a specific winery

• Distribution of soil across a vineyard

• Ranch management transactions applied (e.g., shoot thinning, leafing, dropping fruit)

Unstructured Events such as…

• Imagery

• Soil Moisture Nodes

• Yield Monitors

• Mechanized Asset Control

Page 26: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Data Driven Value

Having a robust data and analytics platform allows us to drive insights

• Grape to Bottle data management

• Improve Quality

• Increase Yield

• Detect and prevent early-onset disease

• Maintain a competitive supply position through predictive yield estimation

Page 27: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Data Lake Model

AWS services: EC2, Elastic Load Balancing, S3, Redshift, SQL on DB Instance, EMR

Layers: Customer Facing Product Layer; Data Processing, Analysis, R&D Layer; Data Collection Layer; Data Layer

Other labels: EDW, Reports, Dashboard, Insights

Page 28: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Shay Har-Noy, VP & General Manager, Platform, DigitalGlobe

Getting Data Out of Jail: DigitalGlobe’s Geospatial Big Data Platform

Page 29: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

@iheartcrowds

Page 30: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

3,500,000 km² collected EVERY DAY

13,200,000,000,000 pixels
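
The two figures on this slide can be cross-checked against each other. The sketch below assumes, as an interpretation not stated on the slide, that both numbers describe the same daily collection and that each pixel is counted once (a single band). Under those assumptions it derives the ground area covered by one pixel and the implied pixel size.

```python
import math

# Figures from the slide.
AREA_KM2_PER_DAY = 3_500_000
PIXELS_PER_DAY = 13_200_000_000_000

# Convert the daily area to square metres (1 km² = 1,000,000 m²).
area_m2_per_day = AREA_KM2_PER_DAY * 1_000_000

# Ground area covered by one pixel, and the side length of that square.
m2_per_pixel = area_m2_per_day / PIXELS_PER_DAY
implied_pixel_size_m = math.sqrt(m2_per_pixel)

# Pixel density per square kilometre.
pixels_per_km2 = PIXELS_PER_DAY / AREA_KM2_PER_DAY

print(f"Area per pixel: {m2_per_pixel:.3f} m²")
print(f"Implied pixel size: {implied_pixel_size_m:.3f} m")
print(f"Pixels per km²: {pixels_per_km2:,.0f}")
```

The result works out to roughly half a metre per pixel, or a little under four million pixels per square kilometre.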

Page 31: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Last 3 Days

Page 32: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Last 14 Days

Page 33: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Last 3 Months

Page 34: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

With frequent coverage of high interest areas

Page 35: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Goal: Large scale information extraction

Page 36: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Three trends working in our favor

1.

Page 37: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Three trends working in our favor

2. [Diagram: neural network with three hidden layers]

Page 38: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Three trends working in our favor

3.

Page 39: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Open up Geo to the world of innovators working on Deep Learning

Solve a new class of problems to impact business, government, and people

Page 40: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Strategic approach: Leverage the ecosystem

Crowdsourcing

Page 41: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

– Data out of jail!

Storage

Compute

GBD Tools

Partner Tools

Maps API

Catalog API

Workflow API

Page 42: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Scale humans until machines get smarter

Use humans to create training data

Crowdsourcing

Page 43: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Training and testing set to allow benchmarking and iteration

Page 44: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

We have common frameworks to accelerate prototyping

We need a common data set to evaluate performance

Page 45: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

- 50 cm WorldView-2 (WV-2) imagery with 8 spectral bands

- Each image covers 200 m × 200 m

- Covering Rio de Janeiro, Brazil

- 7,000 GeoTIFF images

Available for your computing pleasure on
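
The dataset figures above give a rough sense of its size. The sketch below is back-of-the-envelope arithmetic only. It assumes that the image size on the slide means a tile 200 m on each side (the slide's notation is ambiguous), that tiles do not overlap, and that every tile carries all 8 bands at 50 cm resolution.

```python
# Figures from the slide; the tile side length is an assumed interpretation.
TILE_SIDE_M = 200        # assumption: each image is 200 m x 200 m
PIXEL_SIZE_M = 0.5       # 50 cm imagery
BAND_COUNT = 8           # spectral bands per image
TILE_COUNT = 7_000       # GeoTIFF images in the dataset

# Pixels along one side of a tile, and pixels per band in one tile.
pixels_per_side = round(TILE_SIDE_M / PIXEL_SIZE_M)
pixels_per_tile = pixels_per_side ** 2

# Total ground area if no tiles overlap (1 km² = 1,000,000 m²).
total_area_km2 = TILE_COUNT * TILE_SIDE_M ** 2 / 1_000_000

# Total number of pixel values across all tiles and bands.
total_values = pixels_per_tile * BAND_COUNT * TILE_COUNT

print(f"{pixels_per_side} x {pixels_per_side} pixels per tile")
print(f"{total_area_km2:.0f} km² covered")
print(f"{total_values:,} pixel values in total")
```

Under these assumptions, the whole set is small enough to iterate on with a single machine, which suits benchmarking.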

Page 46: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Rio Olympics – Use Case

• Governments around the world were preparing for the upcoming Olympics in Rio de Janeiro

• A thorough understanding of the security and safety of the athletes was front of mind

• Knowledge of high-risk areas, traffic patterns, line of sight, and overall structure distribution was critical to ensuring secure Games

Page 55: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

What you can expect

1) Satellite imagery analytics will be way more common in your industry

2) Computer vision tasks are going to change the way we see the world

3) We are going to learn a lot

Page 56: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Try it yourself

aws.amazon.com/public-data-sets/spacenet/

developer.digitalglobe.com

Page 57: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Thank you!

Page 58: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Remember to complete your evaluations!

Page 59: AWS re:Invent 2016: Earth on AWS—Next-Generation Open Data Platforms (STG203)

Related Sessions