Weather scraper for your data warehouse



Weather Scraper

Get weather information for your data warehouse and your reporting and analytical needs.

Create a SQL table to store the weather information:

CREATE TABLE [dbo].[Weather](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [InsertDate] [varchar](255) NULL,
    [ZipCode] [varchar](255) NULL,
    [CityID] [varchar](255) NULL,
    [CityName] [varchar](255) NULL,
    [CoordLong] [varchar](255) NULL,
    [CoordLat] [varchar](255) NULL,
    [Country] [varchar](255) NULL,
    [SunriseStart] [varchar](255) NULL,
    [SunriseSet] [varchar](255) NULL,
    [TemperatureAvg] [varchar](255) NULL,
    [TemperatureMin] [varchar](255) NULL,
    [TemperatureMax] [varchar](255) NULL,
    [TemperatureUnit] [varchar](255) NULL,
    [HumidityValue] [varchar](255) NULL,
    [HumidityUnit] [varchar](255) NULL,
    [PressureValue] [varchar](255) NULL,
    [PressureUnit] [varchar](255) NULL,
    [WindSpeedValue] [varchar](255) NULL,
    [WindSpeedName] [varchar](255) NULL,
    [WindDirectionValue] [varchar](255) NULL,
    [WindDirectionCode] [varchar](255) NULL,
    [WindDirectionName] [varchar](255) NULL,
    [CloudValue] [varchar](255) NULL,
    [CloudName] [varchar](255) NULL,
    [PrecipitationMode] [varchar](255) NULL,
    [WeatherNumber] [varchar](255) NULL,
    [WeatherValue] [varchar](255) NULL,
    [WeatherIcon] [varchar](255) NULL,
    [LastUpdateValue] [varchar](255) NULL,
    PRIMARY KEY CLUSTERED ([ID] ASC)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

After the table is created, we will use the RESTful API at http://api.openweathermap.org to access and store the weather information.

You can see a sample of the weather information returned by a query at this link:

http://api.openweathermap.org/data/2.5/weather?q=55441&mode=xml
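To make the response shape concrete, here is a minimal Python sketch that parses a response of this kind into the column names of the Weather table. Python is used only for illustration and is not part of the SSIS package. The sample XML is an assumption: its element and attribute names are inferred from the table columns and should be checked against the live response, and every value in it is made up.

```python
import xml.etree.ElementTree as ET

# Assumed sample response: names inferred from the table columns, values made up.
SAMPLE_XML = """<current>
  <city id="0" name="Sample City">
    <coord lon="-93.4" lat="45.0"/>
    <country>US</country>
    <sun rise="2014-06-01T10:30:00" set="2014-06-02T01:50:00"/>
  </city>
  <temperature value="295.0" min="293.0" max="297.0" unit="kelvin"/>
  <humidity value="60" unit="%"/>
  <pressure value="1012" unit="hPa"/>
  <wind>
    <speed value="3.1" name="Light breeze"/>
    <direction value="180" code="S" name="South"/>
  </wind>
  <clouds value="40" name="scattered clouds"/>
  <precipitation mode="no"/>
  <weather number="802" value="scattered clouds" icon="03d"/>
  <lastupdate value="2014-06-01T15:00:00"/>
</current>"""

# (column in dbo.Weather, element path under the root, attribute; None = element text)
FIELD_MAP = [
    ("CityID", "city", "id"),
    ("CityName", "city", "name"),
    ("CoordLong", "city/coord", "lon"),
    ("CoordLat", "city/coord", "lat"),
    ("Country", "city/country", None),
    ("SunriseStart", "city/sun", "rise"),
    ("SunriseSet", "city/sun", "set"),
    ("TemperatureAvg", "temperature", "value"),
    ("TemperatureMin", "temperature", "min"),
    ("TemperatureMax", "temperature", "max"),
    ("TemperatureUnit", "temperature", "unit"),
    ("HumidityValue", "humidity", "value"),
    ("HumidityUnit", "humidity", "unit"),
    ("PressureValue", "pressure", "value"),
    ("PressureUnit", "pressure", "unit"),
    ("WindSpeedValue", "wind/speed", "value"),
    ("WindSpeedName", "wind/speed", "name"),
    ("WindDirectionValue", "wind/direction", "value"),
    ("WindDirectionCode", "wind/direction", "code"),
    ("WindDirectionName", "wind/direction", "name"),
    ("CloudValue", "clouds", "value"),
    ("CloudName", "clouds", "name"),
    ("PrecipitationMode", "precipitation", "mode"),
    ("WeatherNumber", "weather", "number"),
    ("WeatherValue", "weather", "value"),
    ("WeatherIcon", "weather", "icon"),
    ("LastUpdateValue", "lastupdate", "value"),
]


def parse_weather(xml_text):
    """Return a dict keyed by Weather column name; missing elements become None."""
    root = ET.fromstring(xml_text)
    row = {}
    for column, path, attr in FIELD_MAP:
        node = root.find(path)
        if node is None:
            row[column] = None
        elif attr is None:
            row[column] = node.text
        else:
            row[column] = node.get(attr)
    return row
```

Everything stays a string, which matches the all-varchar table above. ID, InsertDate and ZipCode are not in the map because they do not come from the response.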


SSIS Package

The package is very simple.

1. Get the list of all the ZipCodes.
2. Loop through each ZipCode and get the current weather information for it.

Details:


SELECT DISTINCT [ZipCode] FROM [dbo].[ZipCodes] ORDER BY [ZipCode] DESC


Store the result set in a variable.

Loop through each individual ZipCode in the ForEachLoop container.


Map each individual ZipCode to a ZipCode variable.


Pass the “loaded” ZipCode variable from the ForEachLoop container to the Script Task so that it can pull the weather information for that particular ZipCode.
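The control flow described so far (query the distinct ZipCodes, loop over them, hand each one to the Script Task) can be sketched outside SSIS as a plain loop. This is an illustrative Python sketch, not the package itself. The names `distinct_zip_codes`, `scrape_all`, `fetch_weather` and `save_weather` are hypothetical, and collecting failures instead of stopping on the first one is a design choice of the sketch rather than something the package is stated to do.

```python
def distinct_zip_codes(raw_zip_codes):
    """Mirror SELECT DISTINCT ... ORDER BY ZipCode DESC on a list of strings."""
    return sorted(set(raw_zip_codes), reverse=True)


def scrape_all(zip_codes, fetch_weather, save_weather):
    """Fetch and save once per ZIP code; return (zip_code, error) for each failure."""
    failed = []
    for zip_code in zip_codes:
        try:
            # One iteration corresponds to one run of the Script Task.
            save_weather(zip_code, fetch_weather(zip_code))
        except Exception as exc:
            failed.append((zip_code, str(exc)))
    return failed
```

Passing the fetch and save steps in as functions keeps the loop testable without a network connection or a database.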


Edit the Script Task to see the code.

This is the key: build your URL using the ZipCode in order to get the result.

I specify USA in the string to return only US results. The openweathermap website has extensive documentation on how to search for specific data:

http://api.openweathermap.org/API#search_city

var url = @"http://api.openweathermap.org/data/2.5/weather?q="+ZipCode+",USA&mode=xml";
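As an illustration of the same URL construction outside the Script Task, here is a small Python sketch. The function name `build_weather_url` is hypothetical. The optional `api_key` argument is an addition that is not in the original: the service now expects a key in an `appid` parameter, which the URL above predates.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.openweathermap.org/data/2.5/weather"


def build_weather_url(zip_code, country="USA", api_key=None):
    """Build the request URL for one ZIP code, with the response in XML mode."""
    params = {"q": f"{zip_code},{country}", "mode": "xml"}
    if api_key is not None:
        params["appid"] = api_key  # not part of the original URL
    # safe="," keeps the comma between ZIP code and country unescaped.
    return BASE_URL + "?" + urlencode(params, safe=",")
```

Pass ZIP codes as strings rather than integers so that a leading zero is not lost.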

The two methods in my implementation are the main() method and the SaveWeatherData() method.


The main() method builds the URL, makes the call to the API, and parses the resulting XML.

The SaveWeatherData() method is called by the main() method. It takes the parsed values as parameters and persists them in the database table.
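The persistence step can be sketched as a parameterized INSERT. The Python sketch below is not the author's SaveWeatherData() code, which is not shown in this transcript. It uses an in-memory SQLite table with only a few of the columns as a stand-in for SQL Server so that it runs anywhere, and the name `save_weather_data` is hypothetical.

```python
import sqlite3
from datetime import datetime


def save_weather_data(conn, zip_code, row):
    """Insert one weather row; values are bound as parameters, never concatenated.

    `row` maps column names to values. Column names must come from your own
    code (for example a fixed field map), never from external input.
    """
    record = {
        "InsertDate": datetime.now().isoformat(timespec="seconds"),
        "ZipCode": zip_code,
        **row,
    }
    columns = ", ".join(record)
    placeholders = ", ".join("?" for _ in record)
    conn.execute(
        f"INSERT INTO Weather ({columns}) VALUES ({placeholders})",
        list(record.values()),
    )
    conn.commit()


# Stand-in table (assumption): SQLite instead of SQL Server, subset of columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE Weather (
        ID INTEGER PRIMARY KEY AUTOINCREMENT,
        InsertDate TEXT, ZipCode TEXT, CityName TEXT, TemperatureAvg TEXT)"""
)
save_weather_data(conn, "55441", {"CityName": "Sample City", "TemperatureAvg": "295.0"})
```

Binding values as parameters matters here because city and weather names come from an external service and may contain quotes.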


Each time the Script Task is called, the weather data for that ZipCode will be returned and inserted into your table.

The execution looks like this.

Your results should look like this.


Now you have detailed weather information with dates and zip codes at your disposal. You can tie this to location information in your database or data warehouse to do extensive querying. E.g.

How does rain affect my sales by region?

How does humidity affect sales?

How does cloud cover affect sales?

How does weather affect tips?

How does weather affect employee productivity?
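As a rough illustration of the first question, the Python sketch below joins weather rows to sales on ZipCode and date. Everything about the sales side is an assumption: the `DailySales` table, its columns and all the data are hypothetical, and SQLite stands in for your warehouse. The `PrecipitationMode` values `rain` and `no` are also assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: DailySales is not part of the original package.
conn.executescript(
    """
    CREATE TABLE Weather (InsertDate TEXT, ZipCode TEXT, PrecipitationMode TEXT);
    CREATE TABLE DailySales (SaleDate TEXT, ZipCode TEXT, Amount REAL);
    INSERT INTO Weather VALUES
        ('2014-06-01T12:00:00', '55441', 'rain'),
        ('2014-06-02T12:00:00', '55441', 'no');
    INSERT INTO DailySales VALUES
        ('2014-06-01', '55441', 800.0),
        ('2014-06-02', '55441', 1200.0);
    """
)

# Average sales per precipitation mode, matched on ZIP code and calendar day.
RAIN_VS_SALES = """
SELECT w.PrecipitationMode, AVG(s.Amount) AS AvgSales
FROM DailySales AS s
JOIN Weather AS w
  ON w.ZipCode = s.ZipCode
 AND substr(w.InsertDate, 1, 10) = s.SaleDate
GROUP BY w.PrecipitationMode
ORDER BY w.PrecipitationMode
"""
result = conn.execute(RAIN_VS_SALES).fetchall()
```

If the job runs more than once a day, there will be several weather rows per ZipCode and day, so reduce them to one row per day before joining or each sale will be counted several times.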

The job can be scheduled to run hourly, daily, weekly or whatever frequency you want.

The sky (pun intended) is virtually the limit on this.

Good Luck.

About: Fru Louis is a developer, blogger and all-around technology enthusiast. Fru is also a contributor and principal of the BIWizzard blog. He writes about and stays abreast of the latest innovative ideas, news, and trends. Have a tip, comment or critique? Email him at fru.louis.gmail.com.