Weather scraper for your data warehouse
Weather Scraper: get weather information for your data warehouse and your reporting/analytical needs.
Create a SQL table to store the weather information:
CREATE TABLE [dbo].[Weather](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [InsertDate] [varchar](255) NULL,
    [ZipCode] [varchar](255) NULL,
    [CityID] [varchar](255) NULL,
    [CityName] [varchar](255) NULL,
    [CoordLong] [varchar](255) NULL,
    [CoordLat] [varchar](255) NULL,
    [Country] [varchar](255) NULL,
    [SunriseStart] [varchar](255) NULL,
    [SunriseSet] [varchar](255) NULL,
    [TemperatureAvg] [varchar](255) NULL,
    [TemperatureMin] [varchar](255) NULL,
    [TemperatureMax] [varchar](255) NULL,
    [TemperatureUnit] [varchar](255) NULL,
    [HumidityValue] [varchar](255) NULL,
    [HumidityUnit] [varchar](255) NULL,
    [PressureValue] [varchar](255) NULL,
    [PressureUnit] [varchar](255) NULL,
    [WindSpeedValue] [varchar](255) NULL,
    [WindSpeedName] [varchar](255) NULL,
    [WindDirectionValue] [varchar](255) NULL,
    [WindDirectionCode] [varchar](255) NULL,
    [WindDirectionName] [varchar](255) NULL,
    [CloudValue] [varchar](255) NULL,
    [CloudName] [varchar](255) NULL,
    [PrecipitationMode] [varchar](255) NULL,
    [WeatherNumber] [varchar](255) NULL,
    [WeatherValue] [varchar](255) NULL,
    [WeatherIcon] [varchar](255) NULL,
    [LastUpdateValue] [varchar](255) NULL,
    PRIMARY KEY CLUSTERED ( [ID] ASC )
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Once the table is created, we will use the RESTful API at http://api.openweathermap.org to fetch the weather information and store it.
You can see a sample of the weather information the API returns by opening this link:
http://api.openweathermap.org/data/2.5/weather?q=55441&mode=xml
SSIS Package
The package is very simple:
1. Get the list of all the ZipCodes.
2. Loop through each ZipCode and get the current weather information for it.
Details:
SELECT DISTINCT [ZipCode] FROM [dbo].[ZipCodes] ORDER BY [ZipCode] DESC
Store the results in a variable.
Loop through each individual ZipCode in the Foreach Loop container.
Map each individual zip code to a ZipCode variable.
Pass the populated ZipCode variable from the Foreach Loop container to the Script Task so that it can pull the weather information for that particular ZipCode.
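The control flow described so far can be sketched outside SSIS. The following is a rough Python stand-in, not the package itself: `run_package` and `fetch_weather` are hypothetical names, and the fetch step is passed in as a function so that the loop can be shown on its own.

```python
def run_package(zip_codes, fetch_weather):
    """Call fetch_weather once per distinct zip code, highest zip code first.

    Mirrors SELECT DISTINCT ... ORDER BY ZipCode DESC followed by the
    Foreach Loop: each zip code plays the role of the ZipCode variable.
    """
    results = {}
    for zip_code in sorted(set(zip_codes), reverse=True):
        results[zip_code] = fetch_weather(zip_code)
    return results


# Stub fetcher for illustration only; a real one would call the weather API.
demo = run_package(["55441", "10001", "55441"], lambda z: f"weather for {z}")
print(list(demo))
```

Duplicates are dropped before the loop, so each zip code triggers exactly one call.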
Edit the Script Task to see the code.
This is the key step: build your URL using the ZipCode in order to get the result.
I specify USA in the string so that only US results are returned. The openweathermap website has plenty of documentation on how to search for specific data:
http://api.openweathermap.org/API#search_city
var url = @"http://api.openweathermap.org/data/2.5/weather?q="+ZipCode+",USA&mode=xml";
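For readers who want to try the same URL outside the Script Task, here is a minimal Python sketch of the same string-building step. The function name `build_weather_url` is made up for illustration. The service may also expect an API key parameter these days; that is not part of the original and is left out here.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.openweathermap.org/data/2.5/weather"


def build_weather_url(zip_code, country="USA"):
    """Return the current-weather URL for one zip code, asking for XML."""
    # safe="," keeps the comma literal so the URL matches the one shown above.
    query = urlencode({"q": f"{zip_code},{country}", "mode": "xml"}, safe=",")
    return f"{BASE_URL}?{query}"


print(build_weather_url("55441"))
```

Going through `urlencode` rather than plain concatenation means an unexpected character in the zip code value gets escaped instead of breaking the query string.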
The two methods in my implementation are main() and SaveWeatherData().
The main() method builds the URL, calls the API, and parses the resulting XML.
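The parsing step might look roughly like the Python sketch below. It is not the Script Task code. The sample XML is hand-written with made-up values, and its element and attribute names are an assumption inferred from the column names of the Weather table; check them against the sample link before relying on them.

```python
import xml.etree.ElementTree as ET

# Hand-written sample with made-up values; layout assumed from the table columns.
SAMPLE_XML = """<current>
  <city id="0" name="Sample City">
    <coord lon="-93.4" lat="45.0"/>
    <country>US</country>
  </city>
  <temperature value="270.1" min="268.0" max="272.0" unit="kelvin"/>
  <humidity value="80" unit="%"/>
  <wind>
    <speed value="3.1" name="Light breeze"/>
    <direction value="200" code="SSW" name="South-southwest"/>
  </wind>
  <clouds value="40" name="scattered clouds"/>
</current>"""


def attr(root, path, name):
    """Return attribute `name` of the element at `path`, or None if absent."""
    node = root.find(path)
    return node.get(name) if node is not None else None


def parse_weather(xml_text):
    """Flatten the XML into a dict keyed by Weather table column names."""
    root = ET.fromstring(xml_text)
    country = root.find("city/country")
    return {
        "CityName": attr(root, "city", "name"),
        "Country": country.text if country is not None else None,
        "CoordLong": attr(root, "city/coord", "lon"),
        "CoordLat": attr(root, "city/coord", "lat"),
        "TemperatureAvg": attr(root, "temperature", "value"),
        "TemperatureUnit": attr(root, "temperature", "unit"),
        "HumidityValue": attr(root, "humidity", "value"),
        "WindSpeedValue": attr(root, "wind/speed", "value"),
        "WindDirectionCode": attr(root, "wind/direction", "code"),
        "CloudName": attr(root, "clouds", "name"),
        "PrecipitationMode": attr(root, "precipitation", "mode"),
    }


reading = parse_weather(SAMPLE_XML)
print(reading["CityName"], reading["TemperatureAvg"])
```

Missing elements come back as None rather than raising an error, which suits a table where every column is nullable.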
The SaveWeatherData() method is called by the main() method. It takes the parsed values as parameters and persists them in the database table.
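As an illustration of the save step, here is a small Python sketch that writes one reading with a parameterized INSERT. It is a stand-in under stated assumptions: an in-memory SQLite database replaces SQL Server so that the example runs anywhere, only a handful of the table's columns are included, and `save_weather_data` is a hypothetical counterpart of SaveWeatherData(), not the original method.

```python
import sqlite3
from datetime import datetime, timezone

# A subset of the Weather table's columns, to keep the sketch short.
COLUMNS = ["InsertDate", "ZipCode", "CityName", "TemperatureAvg", "HumidityValue"]


def create_table(conn):
    """Create a cut-down Weather table (SQLite stand-in for dbo.Weather)."""
    cols = ", ".join(f"{c} TEXT" for c in COLUMNS)
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS Weather "
        f"(ID INTEGER PRIMARY KEY AUTOINCREMENT, {cols})"
    )


def save_weather_data(conn, zip_code, reading):
    """Insert one reading; values are bound as parameters, never concatenated."""
    row = dict(reading)
    row["ZipCode"] = zip_code
    row["InsertDate"] = datetime.now(timezone.utc).isoformat()
    column_list = ", ".join(COLUMNS)
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.execute(
        f"INSERT INTO Weather ({column_list}) VALUES ({placeholders})",
        [row.get(c) for c in COLUMNS],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
create_table(conn)
save_weather_data(
    conn,
    "55441",
    {"CityName": "Sample City", "TemperatureAvg": "270.1", "HumidityValue": "80"},
)
print(conn.execute("SELECT ZipCode, CityName FROM Weather").fetchall())
```

Binding the values as parameters keeps text returned by the API from being interpreted as SQL.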
Each time the Script Task is called, the weather data for that ZipCode is returned and inserted into your table.
The execution looks like this.
Your results should look like this.
Now you have detailed weather information, with dates and zip codes, at your disposal. You can tie this to the location information in your database or data warehouse to do extensive querying. For example:
How does rain affect my sales by region?
How does humidity affect sales?
How does cloud cover affect sales?
How does weather affect tips?
How does weather affect employee productivity?
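To make the first question concrete, here is a small Python sketch that joins weather readings to sales on zip code and date. Everything in it is assumed for illustration: the Sales table, its columns, and the sample rows are invented, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Weather (InsertDate TEXT, ZipCode TEXT, PrecipitationMode TEXT);
CREATE TABLE Sales (SaleDate TEXT, ZipCode TEXT, Amount REAL);  -- hypothetical table
INSERT INTO Weather VALUES ('2014-01-01', '55441', 'rain'),
                           ('2014-01-02', '55441', 'no');
INSERT INTO Sales VALUES ('2014-01-01', '55441', 100.0),
                         ('2014-01-01', '55441', 50.0),
                         ('2014-01-02', '55441', 300.0);
""")

# Total sales grouped by the precipitation recorded for that zip code and day.
QUERY = """
SELECT w.PrecipitationMode, SUM(s.Amount) AS TotalSales
FROM Sales AS s
JOIN Weather AS w
  ON w.ZipCode = s.ZipCode AND w.InsertDate = s.SaleDate
GROUP BY w.PrecipitationMode
ORDER BY w.PrecipitationMode
"""

sales_by_precipitation = conn.execute(QUERY).fetchall()
print(sales_by_precipitation)
```

With hourly readings, the join would need to match on the hour, or the weather would need to be aggregated per day first, so that a sale is not counted once per reading.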
The job can be scheduled to run hourly, daily, weekly or whatever frequency you want.
The sky (pun intended) is virtually the limit on this.
Good Luck.
About: Fru Louis is a developer, blogger, and all-around technology enthusiast. Fru is also a contributor to and principal of the BIWizzard blog. He writes about and stays abreast of the latest innovative ideas, news, and trends. Have a tip, comment, or critique? Email him at fru.louis.gmail.com.