Massively Parallel Calculation of Catchment Areas in Retail...– Use Oracle Spaal’s Geocoder to...
Transcript of Massively Parallel Calculation of Catchment Areas in Retail...– Use Oracle Spaal’s Geocoder to...
!"#$%&'()*+*,-./*0%1234*1567"%*&)8*193&1)48:*;33*%&'()8*%484%<46:**=*
!"#$%&'(%)(%*+('(%,(*%-%!"#$%&'(#)*+'")*(",-./.)01+)2'#"/,)
*,3'+#)4150+/(5)6+"$,')
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
SafeHarborStatement
ThefollowingisintendedtooutlineourgeneralproductdirecFon.ItisintendedforinformaFonpurposesonly,andmaynotbeincorporatedintoanycontract.Itisnotacommitmenttodeliveranymaterial,code,orfuncFonality,andshouldnotberelieduponinmakingpurchasingdecisions.Thedevelopment,release,andFmingofanyfeaturesorfuncFonalitydescribedforOracle’sproductsremainsatthesolediscreFonofOracle.
2
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
Agenda• Thebusinessproblem
• Thetechnologyproblem
• Geocoding• NetworkDataModel
• PuPngitalltogether
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
ApplicaFonScenario• Hundredsofstores• Millionsofcustomers
• Runamassmailingcampaign
• Campaigntargetedtocustomersofeachstore
• IdenFfythe“CatchmentArea”ofeachstore:– Theareafromwherecustomerscanreachastoreinlessthan30minutes.
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
TheProblem…• Findallcustomersthatarewithinasetdistancefromeachstore
• NeedtocomputeN(stores)xM(customers)distances!– For1000storesand10millioncustomers,thatmeans10billioncalculaFons!
• StraightlinecalculaFonsaremisleading– Maybeimpossible(example:nobridge)
• Useroadnetwork– UsetransportaFon7me(notdistance)
– Drivingorwalkingusedifferentroutes
!"#$%&'()*+*,-./*0%1234*1567"%*&)8*193&1)48:*;33*%&'()8*%484%<46:**=*
N"%4*;A"E)*)(4*O%"A34B*V*•!!"B#E)4*)(4*%"E)48*"5*)(4*12)E13*%"16*54)D"%I*E8&5'*5+/8')7&'.)
–!Q4^E&%48*1*%"E)1A34*%"16*54)D"%I*'%1#(*–!N1$*2"58&64%*)%192*8)1F8F28*
•!L446*)"*2"B#E)4*L*X8)"%48Y*Z*N*X2E8)"B4%8Y*+19#'.)[*–!W"%*.---*8)"%48*156*.-*B&33&"5*2E8)"B4%8H*)(1)*B4158*:;)3/,,/1()%"E)4*2132E31F"58[*–!!"8)3$H*4<45*"5*#"D4%?E3*(1%6D1%4*
•! _"E*215*#%4`8"%)*)(4*61)1*–!>4'B45)*2E8)"B4%8*156*8)"%48*E8&5'*a"%"5"&*6&1'%1B8*–!bE)*)(1)*&)843?*&8*1*2"8)3$*#%"2488*
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
TheApproach1. Determinethegeographicloca7onofeachstoreandcustomerfromtheir
addresses– ThisisaprocesscalledGeocoding.– UseOracleSpaFal’sGeocodertodeterminetheroadsegmentforeachaddress
2. DeterminetheReverseNetworkBufferforeachstore– ContainsalltheroadsegmentsfromwhichastorecanbereachedwithinasetFme– UseOracleSpaFal’sNetworkDataModelviaitsJavaAPI
3. CorrelatetheresulFngroadsegmentsusingaregularjoinoperaFon– CanbedoneinasingleSQLstatement,thankstoSQLAnalyFcsfuncFons.
!"#$%&'()*+*,-./*0%1234*1567"%*&)8*193&1)48:*;33*%&'()8*%484%<46:**=*
Q"16*L4)D"%I8*156*;66%48848*&5*0%1234*•! Q4?4%4524*61)1*1<1&31A34*?%"B*@fQf*X1I1*L1<)4^Y*
–! ;<1&31A34*&5*R0%1234*M1)1*W"%B1)R*18*C%158#"%)1A34*C1A348#1248*
–! ;<1&31A34*?"%*B"8)*2"E5)%&48*D"%36D&64*
•! C(&8*61)1*84)*2"5)1&58T*–! C(4*+1"5)('#?1+@H*2"B#34)4*D&)(*'4"B4)%&48H*2"5542F<&)$*156*1g%&AEF"5*
–! ;*6&2F"51%$*"?*.#+''#)("&'.)156*1g%&AE)48*?"%*'4"2"6&5'*
–! K4"'%1#(&2*61)1*?"%*8/.9",/F"71(G)16B&5*A"E561%&48H*D1)4%*A"6&48H**#1%I8HH*#"&5)8`"?`&5)4%48)H*V*
•! 0#45*61)1*B"643*–! f12(*%"16*84'B45)*&523E648*1g%&AE)48*?"%*345')(H*)%1<43*FB4H**122488&A&3&)$*
–! WE33$*6"2EB45)46*156*4Z)458&A34H*?"%*4Z1B#34*D&)(*$"E%*"D5*1g%&AE)48*
•! ;38"*1<1&31A34*?%"B*")(4%*8E##3&4%8*X4:':*C"B)"BY*
!"#$%&'()*+*,-./*0%1234*1567"%*&)8*193&1)48:*;33*%&'()8*%484%<46:**=*
M4B"*156*>1B#34*M1)1*;<1&31A34*"5*0CL*
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
LoadingCustomerAddressesfromaCSVFile
• TheCSVFilesareindirectory“address_data_dir”
• Canprocessanynumberoffiles
• Definedasan“ExternalTable”• Loadthemusinga“CreateTableAsSelect”(CTAS)statement.
• Orusedirectly
CREATE TABLE customer_addresses (
customer_id NUMBER,
housenumber VARCHAR2(10),
streetname VARCHAR2(50),
city VARCHAR2(30),
state VARCHAR2(20),
zip VARCHAR2(10)
) ORGANIZATION EXTERNAL (
TYPE ORACLE_LOADER
DEFAULT DIRECTORY address_data_dir
ACCESS PARAMETERS (
RECORDS DELIMITED BY NEWLINE
FIELDS TERMINATED BY ‘,’
)
LOCATION ('customers.csv')
);
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
Geocoding• Convertastreetaddressintogeographicalcoordinates
– Inputcanbestructuredorunstructured
• Coordinatescanbeexact(“addresspoints”)orcomputedbyinterpolaFon
• Alsoreturnstheroadsegmentid(EDGE_ID)andrelaFvelocaFon(PERCENT)• TwoavailableAPIs
– PL/SQL:bestforbatchprocessing–useittogeocodeallyourexisFngcustomersorstores.Cantakeadvantageofparallelism.
– XMLWebService:useittogeocodeaddressesprovidedbywebapplicaFons
• AstandardfeatureofOracleSpaFalandGraphOpFon
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
ExampleofaGeocodingOperaFon
12
SELECT SDO_GCDR.GEOCODE(
'ODF_NA_Q312',
SDO_KEYWORDARRAY('1500 Clay Street', 'San Francisco, CA'),
'US', 'DEFAULT'
)
FROM DUAL;
SDO_GEO_ADDR(0, SDO_KEYWORDARRAY(), NULL, 'CLAY ST', NULL, NULL,
'SAN FRANCISCO', 'SAN FRANCISCO', 'CA', 'US', '94109', NULL,
'94109', NULL, '1500', 'CLAY', 'ST', 'F', 'F', NULL, NULL,'L’,.8, 198676903,
'??X?#ENUT?B281CP?', 1, 'DEFAULT', -122.418, 37.79277,
'??010101010??004?', 8307)
Schemathatcontainstheaddressdata
UnstructuredAddress
Countrycode
Segment,posi7on
Matchcodes
GeographicCoordinates
!"#$%&'()*+*,-./*0%1234*1567"%*&)8*193&1)48:*;33*%&'()8*%484%<46:**=*
O1%13343*O+&546*C1A34*WE52F"58*•!;33"D8*$"E*)"*'4"2"64*B&33&"58*"?*166%48848*A$*4Z#3"&F5'*#1%13343&8B*
•! G5#E)*)"*)(4*#+&546*?E52F"5*&8*1*$9+.1+)–!!15*#188*15$*>fef!C*8)1)4B45)*E8&5'*1*!\Q>0Q*4Z#%488&"5**
•!0E)#E)*&8*1*#"3,')–!!18)*18*1*%4'E31%*%431F"513*)1A34*E8&5'*)(4*C;befXY*2"58)%E2):*–!!15*?446*)(4*%48E3)*&5)"*1*)1A34*
•!M1)1*k"D8*2"5F5E"E83$*?%"B*&5#E)*)"*"E)#E)**
Pipelined Function !O2K62)Q)KRIR!S)T)
U26D)TV)S*AIR)QTV)
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
InsidethePipelinedFuncFon:ResultTypes• DefineanobjecttypeandarraytoholdtheresultofthefuncFon
CREATE OR REPLACE TYPE gc_result AS OBJECT (
id NUMBER,
geo_addr sdo_geo_addr
);
/
CREATE OR REPLACE TYPE gc_results AS TABLE OF gc_result;
/
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
InsidethePipelinedFuncFon
• ThefuncFonisdeclaredasPIPELINEDandPARALLEL_ENABLE• Itsinputisacursor,itsoutputisatable
CREATE OR REPLACE function geocode_pipe (source_table_cursor IN SYS_REFCURSOR)
RETURN gc_results
DETERMINISTIC
PIPELINED
PARALLEL_ENABLE (PARTITION source_table_cursor BY ANY)
IS
...
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
InsidethePipelinedFuncFon
IS
TYPE number_table IS TABLE OF NUMBER;
TYPE string_table IS TABLE OF VARCHAR2(256);
id_table number_table;
line_1_table string_table;
line_2_table string_table;
gc gc_result := gc_result(0,null);
Stringandnumbertable
types
Tablestofetchtheinputfromthecursor
Geocodingresult
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
InsidethePipelinedFuncFonBEGIN
LOOP
FETCH source_table_cursor
BULK COLLECT INTO id_table, line_1_table, line_2_table
LIMIT 1000;
EXIT WHEN id_table.count = 0;
FOR i IN id_table.. id_table LOOP
gc.id := id_table (i);
gc.geo_addr := SDO_GCDR.GEOCODE (‘ODF_NA_Q312’,
sdo_keywordarray(line_1_table(i), line_2_table(i)), ‘US’, 'default’);
PIPE ROW (gc);
END LOOP;
END LOOP;
CLOSE source_table_cursor;
END; Sendtheresulttothecaller
Fetchtheinputaddressesin
batchesof1000
Geocodeeachaddress
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
UsingthePipelinedFuncFon
• ScalesperfectlyonExadata,butalsoonconvenFonalhardware• GeocodingUSaddressesonExadataX4-2Half-Rack:over23,000addresses/second• Geocode1millioncustomeraddressesinlessthanaminute
CREATE TABLE customers_gc as
SELECT /*+ parallel (16) */
g.id customer_id, g.geo_addr.longitude, g.geo_addr.latitude,
g.geo_addr.edgeid link_id, g.geo_addr.percent percentage
FROM TABLE(geocode_pipe(
CURSOR(SELECT customer_id, housenumber ||' '|| streetname,
city ||' '|| state ||' '|| zip
FROM customer_addresses)
)
) g;
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
GeocodingResult:TableCUSTOMERS_GC• CUSTOMER_IDisthecustomeridenFfier
• LINK_IDistheidenFfierofthenetworklink(roadsegment)onwhichthecustomerislocated
• PERCENTAGEistherelaFvelocaFononthatlink(0to1).
Name Type
----------------- --------
CUSTOMER_ID NUMBER
LONGITUDE NUMBER
LATITUDE NUMBER
LINK_ID NUMBER
PERCENTAGE NUMBER
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
NetworkDataModel• Datamodelformanagingandanalyzingnetworkgraphs
– Roads,transportaFon,uFliFes,...• Formedwithnodesandlinks
– Directedorun-directed– Withorwithoutgeometries– Anyuseragributes(costs,accessrestricFons,...)
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
NetworkDataModel:Analysis
• Operatesonamemory-residentcopyofthenetwork– Networksubsets(parFFons)loaded“ondemand”
• NetworkanalysisfuncFonsaccessedviaaJavaAPI
Memory-resident Network Network Analysis
API
Persistent Network
User Application
!"#$%&'()*+*,-./*0%1234*1567"%*&)8*193&1)48:*;33*%&'()8*%484%<46:**=*
L4)D"%I*M1)1*N"643T*;513$8&8*WE52F"58*•! >("%)48)*#1)(*1513$8&8*
•!L41%48)*54&'(A"%*1513$8&8*
•!p&)(&5*2"8)*1513$8&8**
•!L4)D"%I*bE]4%*X?"%D1%6*156*%4<4%84Y*
•! C%1<43&5'*81348B15*#%"A34B*
•!Q412(1A347Q412(&5'*5"648*
•!n`8("%)48)*#1)(8*1513$8&8*
•!V**
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
UsingtheNDMJavaAPIforSolvingourProblem! Prepareandsetup
– CreateaNetworkAnalystobject! DefineacustomcostfuncFon
– computelinktravelFmefromlinklengthandspeedlimit
! Determinethecatchmentareaforeachstore(usingreachingNetworkBuffer)– Findsallroadsegmentsinthecatchmentarea
! Storetheresultsinatableforfurtheranalysis– AllroadsegmentsidenFfierswithdriveFmeforeachcatchmentarea
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
PreparetoUseNDMAPI
private static NetworkIO networkIO;
private static NetworkAnalyst analyst;
// Open database connection
conn = LODNetworkManager.getConnection(dbUrl, dbUser, dbPassword);
// Get network input/output object
networkIO = LODNetworkManager.getCachedNetworkIO(
conn, networkName, networkName, null);
// Get network analyst
analyst = LODNetworkManager.getNetworkAnalyst(networkIO);
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
DriveTimeCostFuncFon• Eachlinkhasa“speedlimit”agribute(inm/sec)
• FuncFoncomputeslinkdriveFmefromlinklengthandspeedlimit
• TellthenetworkanalysttousethecostfuncFon
public class LinkTravelTimeCalculator implements LinkCostCalculator {
int [] defaultUserDataCategories = {UserDataMetadata.DEFAULT_USER_DATA_CATEGORY};
public LinkTravelTimeCalculator () {}
public double getLinkCost(LODAnalysisInfo analysisInfo) {
LogicalLink link = analysisInfo.getNextLink();
double speed = ((Double)link.getUserData(0).get
(RouterPartitionBlobTranslator11gR2.USER_DATA_INDEX_SPEED_LIMIT)).doubleValue();
return (link.getCost()/speed);
}
LinkCostCalculator[] lccs = {new LinkTravelTimeCalculator};
analyst.setLinkCostCalculators(lccs);
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
ReverseNetworkBuffer
• Computeareversenetworkbufferforeachstore
• Savethebuffer
• BuffersarenowintableSTORE_BUFFER_NBL$
PointOnNet startPoint = new PointOnNet(storeLinkId, storePercentage);
PointOnNet[] startPoints = {startPoint};
NetworkBuffer buffer = analyst.reachingNetworkBuffer(startPoints, cost, null);
tableNamePrefix = “STORE_BUFFER_”;
networkIO.writeNetworkBuffer(buffer, storeId, tableNamePrefix);
Thiscomesfromthegeocoder(whenwegeocodedthestore’s
address)
Drive7me(inseconds)
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
NetworkBufferResult:TableSTORE_BUFFER_NBL$• BUFFER_IDisthestoreidenFfier• Percentagesarebetween0and1• Start%usually0,end%usually1
– exceptforboundarylinks• Costfromstartnodetostore.
• Costfromendnodetostore.
Name Type
----------------- --------
BUFFER_ID NUMBER
LINK_ID NUMBER
START_PERCENTAGE NUMBER
END_PERCENTAGE NUMBER
START_COST NUMBER
END_COST NUMBER
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
PuPngitallTogether(1)
step1 AS (
SELECT c.customer_id AS c_customer_id,
c.link_id AS c_link_id,
c.percentage AS c_percentage,
b.buffer_id AS b_buffer_id,
b.link_id AS b_link_id,
b.start_percentage AS b_start_percentage,
b.end_percentage AS b_end_percentage,
b.start_cost AS b_start_cost,
b.end_cost AS b_end_cost
FROM customers c, store_buffer_nbl$ b
WHERE c.link_id = b.link_id
),
CREATE TABLE store_customer_distances AS
WITH
Step1Joincustomerlinkswithbufferlinks
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
PuPngitallTogether(2)step2 AS (
SELECT c_customer_id, b_buffer_id,
b_start_cost + (c_percentage-b_start_percentage)*(b_end_cost- b_start_cost)/(b_end_percentage-b_start_percentage) cost
FROM step1
WHERE c_percentage BETWEEN b_end_percentage AND b_start_percentage
),
step3 AS (
SELECT c_customer_id, b_buffer_id, cost,
row_number() over (partition by c_customer_id order by cost) rn
FROM step2
)
SELECT /*+ parallel 16 */ customer_id, buffer_id as store_id, cost/60 as minutes
FROM step3
WHERE rn = 1;
Step2Compute
costforeachcustomer
Step3Chooselowest
cost
FinalSaveresults
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
TheResult:TableSTORE_CUSTOMER_DISTANCES
• Onerowpercustomer
• IdenFfierofneareststoreforthatcustomer
• TravelFmefromthatcustomertothestore,inminutes
Name Type
----------------- --------
CUSTOMER_ID NUMBER
STORE_ID NUMBER
MINUTES NUMBER
Readyforfurtheranalyzes
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
VisualizingtheNetwork(MapViewer)
SimpleNetworkView ViewinOracleMaps
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
Conclusions• Theexampleshowsthatyoucanperformcomplexanalysisefficiently
– Correlatehundredsofstoresandmillionsofcustomers
• Thetechnologiesusedcanbeappliedtomanycases– GreatpotenFalforspaFalanalyses,datawarehousing,...
• Tryityourselfusingaready-for-usevirtualmachine(here)– Database,sampledata,scripts,javacode,...
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|
TheSpaFalandGraphSIG
• TheSIGpromotesinteracFonandcommunicaFonthatcandrivethemarketforspaFaltechnologyanddata
• MembersconnectandexchangeknowledgeviaonlinecommuniFesandatannualconferencesandevents
• MeetusattheSummit
• MorningRecepFons
• TuesdayandWednesday/7:45to8:30a.m./RegistraFonArea
• BirdsofaFeatherSession• Tuesday/12to1p.m./LunchRoom
• Joinusonline• LinkedIn(searchfor“LinkedInOracleSpaFal”)• Google+(searchfor“Google+OracleSpaFal”)• IOUGSIG(signupforfreemembershipthroughwww.ioug.org)
• OTNSpaFal–CommuniFes(searchfor“OracleSpaFalandGraphCommunity”)
!"#$%&'()*+*,-./*0%1234*1567"%*&)8*193&1)48:*;33*%&'()8*%484%<46:**=*
Q48"E%248*•!0%1234*C42(5"3"'$*L4)D"%IT**
DDD:"%1234:2"B7)42(54)D"%I761)1A1847"#F"5878#1F13156'%1#(*DDD:"%1234:2"B761)1A1847A&'`61)1`8#1F13`156`'%1#(***
•!A3"'8:"%1234:2"B*""%12348#1F13*"*"%1234hB1#8hA3"'*"A&'61)18#1F13'%1#(*
!a!*>#1F13*\#61)4*?"%*MpQ*
!"#$%&'()*+*,-./*0%1234*1567"%*&)8*193&1)48:*;33*%&'()8*%484%<46:**=*
Q48"E%248*"5*b&'*M1)1*>#1F13*156*K%1#(*•!0%1234*b&'*M1)1*>#1F13*156*K%1#(*"5*0%1234:2"BT***
(g#8T77DDD:"%1234:2"B761)1A1847A&'`61)1`8#1F13`156`'%1#(**
•!0CL*#%"6E2)*#1'4*X)%&13*8"yD1%4*6"D53"168H*6"2EB45)1F"5YT***(g#T77DDD:"%1234:2"B7)42(54)D"%I761)1A184761)1A184`)42(5"3"'&487A&'61)1`8#1F13156'%1#(**
•!b3"'**X)42(5&213*4Z1B#348*156*F#8YT****(g#8T77A3"'8:"%1234:2"B7A&'61)18#1F13'%1#(7**
•!b&'*M1)1*e&)4*a&%)E13*N12(&54*X1*?%44*8156A"Z*45<&%"5B45)*)"*'4)*8)1%)46YT****(g#T77DDD:"%1234:2"B7)42(54)D"%I761)1A1847A&'61)1`1##3&15247"%1234`A&'61)13&)4`,.-/t,z:()B3***
Uu*
Copyright©2014Oracleand/oritsaffiliates.Allrightsreserved.|