StartingWithConvertigoWebIntegrator 5.2.0
description
Transcript of StartingWithConvertigoWebIntegrator 5.2.0
-
StartiWeb
Quick G
CEMS 5.2.0ng with Convertigo Integrator
uide
-
TABLE OF CONTENTS
Getting Fami.........
CWI PurpCWI Conc
GeneCWI ODefini
CST
CWI StudGeneView
P
P
C
My First Conv
Web IntegProjecOpen
Setting Ui
liar with Convertigo Web Integrator...................................................1
ose ......................................................................... 1-1epts and Objects ................................................... 1-2
ral Concept............................................................................. 1-2bjects.................................................................................... 1-2
tions ....................................................................................... 1-3onnector................................................................................................... 1-3creen Class.............................................................................................. 1-4ransaction ................................................................................................ 1-5
io............................................................................. 1-7ral Presentation...................................................................... 1-7Description ............................................................................. 1-8
rojects View ............................................................................................. 1-9roperties View ........................................................................................ 1-10onnector View........................................................................................ 1-14
ertigo Web Integrator Project .......2
ration Project ......................................................... 2-1t Description.......................................................................... 2-1
ing the Sample Project from the Studio ................................. 2-4p the Project............................................................ 2-5
-
ii
TABLE OF CONTENTS
Creating a Project and Setting Connector Parameters................... 2-6Defining Screen Classes .............................................................. 2-10
Root Screen Class ................................................................................... 2-10Additional Screen Classes ....................................................................... 2-15Case of a Country-Specific Google Search Page - Definition of the googleSearchPageFR Screen Class .................................................... 2-21
Creating an Extraction Rule.......................................................... 2-24Defining a Transaction.................................................................. 2-40
Case of a Country-Specific Google Search Page - Definition of the o
Execu
Exposing ProData...
ExposingExposExpos
XSL AJAXStarting with Convertigo Web Integrator - CEMS 5.2.0
ngoogleSearchPageFREntry Screen Class Handler ....................... 2-56ting a Transaction ............................................................... 2-69
jects as Web Services & Presenting.................................................3
Projects as Web Services ...................................... 3-1ing a Project as a REST Web Service .................................. 3-1ing a Project as a SOAP Web Service.................................. 3-2 Presentation....................................................... 3-18
-
LIST OF FIGURES
Figure 1 - 1: CWI ex
Figure 1 - 2: CWI ob
Figure 1 - 3: Conne
Figure 1 - 4: Screen
Figure 1 - 5: Criterio
Figure 1 - 6: Transa
Figure 1 - 7: "On en
Figure 1 - 8: List of
Figure 1 - 9: CWI S
Figure 1 - 10: Forma
Figure 1 - 11: Proper
Figure 1 - 12: Text pr
Figure 1 - 13: Javasc
Figure 1 - 14: Text pr
Figure 1 - 15: Javasc
Figure 1 - 16: Edition
Figure 1 - 17: Config
Figure 1 - 18: Text pr
Figure 1 - 19: Selecti
Figure 1 - 20: Conne
Figure 1 - 21: Conneiii
poses any Web application as SOAP or REST Web service ................... 1-1
jects .......................................................................................................... 1-3
ctors folder ................................................................................................. 1-4
classes and Inherited screen classes folders .......................................... 1-4
n and inherited criterion ............................................................................ 1-5
ctions folder ............................................................................................... 1-6
try" and "on exit" screen class handlers .................................................... 1-7
statements ................................................................................................. 1-7
tudio ........................................................................................................... 1-8
tting of unsaved, inherited and disabled objects in the Projects View ..... 1-10
ties View .................................................................................................. 1-11
operty field edition ................................................................................... 1-11
ript expression field edition ...................................................................... 1-12
operty edition via edition window ............................................................ 1-12
ript expression window ............................................................................ 1-12
window ................................................................................................... 1-13
uration window ......................................................................................... 1-13
operty edition via drop-down menu ......................................................... 1-13
ng a value in the drop-down menu .......................................................... 1-14
ctor View (Design tab selected) ............................................................... 1-15
ctor View (Output tab selected) ............................................................... 1-16
-
iv
LIST OF FIGURES
Figure 1 - 22: DOM Tree ........................................................................................................... 1-17
Figure 1 - 23: HTML element selected in the Web Browser ..................................................... 1-18
Figure 1 - 24: XPath Evaluator ................................................................................................. 1-18
Figure 2 - 1: Connection to Google HTTP server ..................................................................... 2-2
Figure 2 - 2: Input of a search keyword in Google search field ................................................ 2-2
Figure 2 - 3: Click on Google Search button ............................................................................. 2-3
Figure 2 - 4: Extraction of Web page titles and URLs in XML format ....................................... 2-3
Figure 2 - 5: Click o
Figure 2 - 6: Proces
Figure 2 - 7: Launch
Figure 2 - 8: Selecti
Figure 2 - 9: Sample
Figure 2 - 10: New P
Figure 2 - 11: Selecti
Figure 2 - 12: Enterin
Figure 2 - 13: Enterin
Figure 2 - 14: Setting
Figure 2 - 15: New pr
Figure 2 - 16: sample
Figure 2 - 17: DOM T
Figure 2 - 18: Google
Figure 2 - 19: Detect
Figure 2 - 20: Expand
Figure 2 - 21: Genera
Figure 2 - 22: XPath ..........
Figure 2 - 23: New S
Figure 2 - 24: Giving
Figure 2 - 25: New sc
Figure 2 - 26: SelectiStarting with Convertigo Web Integrator - CEMS 5.2.0
n the Next button ....................................................................................... 2-4
s stops on Google page with no Next button ............................................ 2-4
ing the New Project wizard ....................................................................... 2-5
ng the CWI sample project ........................................................................ 2-5
project in Projects View ........................................................................... 2-5
roject wizard ............................................................................................... 2-6
ng a project type ........................................................................................ 2-7
g a project name ....................................................................................... 2-7
g a connector name .................................................................................. 2-8
HTTP or HTML connector parameters ..................................................... 2-8
oject summary window .............................................................................. 2-9
Google project appearing in the Projects View ......................................... 2-9
ree and Web Browser ............................................................................. 2-10
general search page .............................................................................. 2-11
ion criterion of the googleWebPages parent screen class ....................... 2-11
ing an element to find relevant nodes and attributes in the DOM Tree .. 2-12
ting XPath from class and text attributes ................................................ 2-12
expression of the detection criterion of the googleWebPages screen class .................................................................................................................... 2-13
creen Class wizard .................................................................................. 2-13
a name to the new screen class .............................................................. 2-14
reen class appearing in the Projects View .............................................. 2-14
on of a detection criterion for the googleSearchPage screen class ........ 2-16
-
LIST OF FIGURES
Figure 2 - 27: Attribute nodes of the HTML element selected as detection criterion for the googleSearchPages screen class ...................................................................... 2-16
Figure 2 - 28: googleSearchPage screen class in Inherited screen classes folder .................. 2-17
Figure 2 - 29: googleSearchPage XPath evaluation on a Google result page ......................... 2-18
Figure 2 - 30: Selection of a detection criterion for the googleResultsPage screen class ........ 2-18
Figure 2 - 31: Searching for a better detection criterion ........................................................... 2-19
Figure 2 - 32: Generation of XPath for the googleResultsPage screen class .......................... 2-20
Figure 2 - 33: googleResultsPage screen class in Inherited screen classes folder .................. 2-20
Figure 2 - 34: Examp
Figure 2 - 35: Examp
Figure 2 - 36: Selecti
Figure 2 - 37: Expandspecific
Figure 2 - 38: XPath
Figure 2 - 39: google
Figure 2 - 40: New E
Figure 2 - 41: Giving
Figure 2 - 42: HTML
Figure 2 - 43: google
Figure 2 - 44: Extract
Figure 2 - 45: Selecti
Figure 2 - 46: Root X
Figure 2 - 47: Root X
Figure 2 - 48: Genera
Figure 2 - 49: Google
Figure 2 - 50: Creatin
Figure 2 - 51: New ro
Figure 2 - 52: Row pr
Figure 2 - 53: Creatin
Figure 2 - 54: New cov
le of country-specific non-anglophone Google search page (France) .... 2-21
le of country-specific anglophone Google search page (UK) .................. 2-22
on of a detection criterion for the country-specific screen class .............. 2-23
ing an element to find relevant attributes in the DOM Tree for the country- screen class .......................................................................................... 2-23
expression of country-specific redirecting link ......................................... 2-24
SearchPageFR screen class in Projects View ........................................ 2-24
xtraction Rule wizard ............................................................................... 2-26
a name to the new extraction rule ........................................................... 2-26
tables definition ........................................................................................ 2-27
Results extraction rule in Extraction rules folder ..................................... 2-27
ion rule properties to be set ..................................................................... 2-28
on of an XPath for Google result items .................................................... 2-29
Path expression and tree ........................................................................ 2-29
Path expression and expanded tree ........................................................ 2-30
ting the table row XPath corresponding to Google results ..................... 2-30
results XPath based upon root XPath .................................................... 2-30
g a new row ............................................................................................ 2-31
w ............................................................................................................. 2-31
operties ................................................................................................... 2-31
g a new column ....................................................................................... 2-32
lumn ........................................................................................................ 2-32
-
vi
LIST OF FIGURES
Figure 2 - 55: Column properties .............................................................................................. 2-32
Figure 2 - 56: Extraction rule columns in Projects Folder ......................................................... 2-33
Figure 2 - 57: Expanding the LI element to find an XPath corresponding to Web page titles .. 2-33
Figure 2 - 58: H3 node corresponding to a Web page title ....................................................... 2-34
Figure 2 - 59: Row XPath expression and tree ......................................................................... 2-34
Figure 2 - 60: Generating the XPath expression of Web page titles ......................................... 2-35
Figure 2 - 61: Web page titles XPath based upon row XPath .................................................. 2-35
Figure 2 - 62: Selecti
Figure 2 - 63: Row X
Figure 2 - 64: Genera
Figure 2 - 65: URL X
Figure 2 - 66: Genera
Figure 2 - 67: Result
Figure 2 - 68: Structu
Figure 2 - 69: New T
Figure 2 - 70: Enterin
Figure 2 - 71: New tr
Figure 2 - 72: Creatin
Figure 2 - 73: New V
Figure 2 - 74: Giving
Figure 2 - 75: New va
Figure 2 - 76: Variab
Figure 2 - 77: New tr
Figure 2 - 78: New en
Figure 2 - 79: Selecti
Figure 2 - 80: Expans
Figure 2 - 81: Prevou..........
Figure 2 - 82: New S
Figure 2 - 83: ProperStarting with Convertigo Web Integrator - CEMS 5.2.0
on of a node for generating the XPath of URLs ....................................... 2-36
Path expression and tree ......................................................................... 2-36
ting the XPath expression of URLs ........................................................ 2-37
Path based upon row XPath .................................................................... 2-37
tion of XML document ............................................................................ 2-38
of the XML generation displayed under the Output tab ........................... 2-39
re of the XML output ............................................................................... 2-39
ransaction wizard ..................................................................................... 2-41
g a new transaction name ....................................................................... 2-41
ansaction in Projects View ....................................................................... 2-42
g a new transaction variable ................................................................... 2-42
ariable wizard ........................................................................................... 2-43
a name to the new variable ..................................................................... 2-43
riable in transaction Variables folder ...................................................... 2-43
le properties ............................................................................................. 2-44
ansaction handler window ........................................................................ 2-45
try screen class handler in the Projects View ......................................... 2-46
on of Google search field in Web Browser .............................................. 2-47
ion of the DOM tree to find an appropriate node for the search field ..... 2-47
sly generated XPath corresponds to one element of the Web page only ........................................................................................................................ 2-48
tatement wizard - Input HTML set value .................................................. 2-48
statement name ...................................................................................... 2-49
-
LIST OF FIGURES
Figure 2 - 84: Improper statement name .................................................................................. 2-49
Figure 2 - 85: New Input HTML set value statement ................................................................ 2-49
Figure 2 - 86: HTML Input set value statement properties ....................................................... 2-50
Figure 2 - 87: Edition of the Expression property ..................................................................... 2-50
Figure 2 - 88: Updated statement ............................................................................................. 2-51
Figure 2 - 89: Selection of the Google Search button in the Web Browser .............................. 2-51
Figure 2 - 90: Expansion of the DOM tree to find an appropriate element for the Search button ................................................................................................................................ 2-52
Figure 2 - 91: New S
Figure 2 - 92: New m
Figure 2 - 93: Mouse
Figure 2 - 94: Trigger
Figure 2 - 95: Entry s
Figure 2 - 96: ongoog
Figure 2 - 97: Selecti
Figure 2 - 98: Expand
Figure 2 - 99: Genera
Figure 2 - 100: New S
Figure 2 - 101: New m
Figure 2 - 102: clickLin
Figure 2 - 103: New ex
Figure 2 - 104: Selecti
Figure 2 - 105: Expans
Figure 2 - 106: New S
Figure 2 - 107: New m
Figure 2 - 108: Exit sc
Figure 2 - 109: Google
Figure 2 - 110: Creatio
Figure 2 - 111: Expans
Figure 2 - 112: googlevii
tatement wizard - Mouse action ............................................................... 2-53
ouse click statement on entry screen class handler ................................ 2-53
click statement properties ....................................................................... 2-54
editor window ......................................................................................... 2-54
creen class handler properties ................................................................ 2-56
leSearchPageFR screen class handler in Projects View ....................... 2-57
on of a detection criterion for the googleSearchPageFR screen class .... 2-58
ing an element to find relevant attributes in the DOM Tree .................... 2-58
tion of XPath for the googleSearchPageFR screen class ...................... 2-59
tatement wizard - Mouse action ............................................................... 2-59
ouse click statement on entry screen class handler ................................ 2-60
kGoogleCom statement - Synchronization value ................................... 2-60
it screen class handler in the Projects View ........................................... 2-61
on of the Google results Next link in the Web Browser ........................... 2-62
ion of the DOM tree to find an appropriate node for the Next button ..... 2-62
tatement wizard - Mouse action ............................................................... 2-63
ouse click statement on exit scren class handler .................................... 2-64
reen class handler properties .................................................................. 2-64
page with no Next link ............................................................................ 2-65
n of a new screen class based on the detection of the Next link ............ 2-66
ion of the DOM tree to find an appropriate node for the Next button ..... 2-66
ResultsPageCurrent screen class in the Projects View ........................... 2-67
-
viii
LIST OF FIGURES
Figure 2 - 113: ongoogleResultsPageFinalExit screen class handler in the Projects View ........ 2-67
Figure 2 - 114: ongoogleResultsPageFinal properties ............................................................... 2-68
Figure 2 - 115: Screen class handler properties - Screen class drop-down menu ..................... 2-68
Figure 2 - 116: XML output without accumulate data in same table property set ....................... 2-70
Figure 2 - 117: Extraction rule properties - Accumulate data in same table ............................... 2-71
Figure 2 - 118: XML output with accumulate data in same table property set ............................ 2-72
Figure 2 - 119: Convertigo test platform URL ............................................................................. 2-72
Figure 2 - 120: Test pl
Figure 2 - 121: XML o
Figure 3 - 1: URL sy
Figure 3 - 2: search
Figure 3 - 3: Setting
Figure 3 - 4: Update
Figure 3 - 5: Update
Figure 3 - 6: Transa
Figure 3 - 7: Project
Figure 3 - 8: Project
Figure 3 - 9: Project
Figure 3 - 10: Transa
Figure 3 - 11: URL sy
Figure 3 - 12: SOAP
Figure 3 - 13: Launch
Figure 3 - 14: Web S
Figure 3 - 15: Web S
Figure 3 - 16: URL of
Figure 3 - 17: Web S
Figure 3 - 18: WSDL
Figure 3 - 19: List of
Figure 3 - 20: WSDLStarting with Convertigo Web Integrator - CEMS 5.2.0
atform ...................................................................................................... 2-73
utput in the Execution result window of the test platform ......................... 2-74
ntax for exposing transactions as REST Web services ............................ 3-2
Google transaction properties ................................................................... 3-3
the Public Method property to true ........................................................... 3-4
transaction schema from transaction definition ....................................... 3-5
transaction schema- Confirmation window .............................................. 3-5
ction schema types window ...................................................................... 3-6
folder in Project Explorer .......................................................................... 3-7
schema in Design tab of Editor ................................................................ 3-8
schema in Source tab of Editor ................................................................ 3-9
ction data types in schema ...................................................................... 3-10
ntax for displaying the SOAP Web service WSDL ................................. 3-10
Web service WSDL ................................................................................. 3-11
ing the Web Services Explorer ............................................................... 3-12
ervices Explorer ....................................................................................... 3-12
ervices Explorer ....................................................................................... 3-13
the SOAP service in the Web Services Explorer ................................... 3-14
ervices Explorer panes after a WSDL URL is reached ............................ 3-14
Binding Details in Actions pane of the Web Services Explorer ............... 3-15
methods displayed in the Web Services Explorer ................................... 3-15
service elements displayed in the Navigator pane .................................. 3-15
-
LIST OF FIGURES
Figure 3 - 21: Invoke a WSDL Operation view in the Actions pane .......................................... 3-16
Figure 3 - 22: Invoking a WSDL Operation ............................................................................... 3-16
Figure 3 - 23: Status pane after WSDL operation invocation ................................................... 3-16
Figure 3 - 24: SOAP Request envelope detail .......................................................................... 3-17
Figure 3 - 25: SOAP response envelope detail ........................................................................ 3-17
Figure 3 - 26: New Sheet wizard .............................................................................................. 3-18
Figure 3 - 27: Entering a name for the new style sheet ............................................................ 3-19
Figure 3 - 28: New st
Figure 3 - 29: Project
Figure 3 - 30: New F
Figure 3 - 31: New fil
Figure 3 - 32: XSL st
Figure 3 - 33: XSL st
Figure 3 - 34: Sheet
Figure 3 - 35: Transa
Figure 3 - 36: Transa
Figure 3 - 37: XML oix
yle sheet object in the Projects View ....................................................... 3-19
Explorer - New file .................................................................................. 3-19
ile wizard .................................................................................................. 3-20
e in Project Explorer ................................................................................ 3-20
yle sheet .................................................................................................. 3-21
yle sheet templates .................................................................................. 3-22
properties ................................................................................................. 3-23
ction properties ........................................................................................ 3-24
ction Style Sheet property ....................................................................... 3-24
utput after XSL transformation ................................................................. 3-25
-
xLIST OF FIGURES Starting with Convertigo Web Integrator - CEMS 5.2.0
-
LIST OF TABLES
Table 1 - 1: CWI ob
Table 1 - 2: Conne
Table 1 - 3: Screen
Table 1 - 4: CWI vi
Table 1 - 5: Standa
Table 1 - 6: Project
Table 1 - 7: Proper
Table 1 - 8: Conne
Table 1 - 9: Conne
Table 1 - 10: Dom T
Table 1 - 11: XPath
Table 2 - 1: Table e
Table 2 - 2: Extract
Table 2 - 3: New tr
Table 2 - 4: Synchr
Table 2 - 5: Result
Table 2 - 6: Result
Table 2 - 7: Google
Table 3 - 1: Web S
Table 3 - 2: Style Sxi
jects .......................................................................................................... 1-2
ctor types ................................................................................................... 1-3
class handler types .................................................................................. 1-6
ews ............................................................................................................ 1-8
rd Eclipse buttons ..................................................................................... 1-9
s View tabs and icons ............................................................................... 1-9
ties View icons ......................................................................................... 1-11
ctor View parts ......................................................................................... 1-14
ctor View tabs, icons and field ................................................................. 1-15
ree icons .................................................................................................. 1-17
Evaluator elements .................................................................................. 1-19
xtraction rule - line and column description ............................................ 2-25
ion rule properties ................................................................................... 2-28
ansaction handler window description ..................................................... 2-45
onizer types ............................................................................................. 2-55
property values for entry screen class handlers ...................................... 2-56
property values for exit screen class handlers ........................................ 2-64
result page distinction ............................................................................ 2-65
ervices Explorer ....................................................................................... 3-12
heet transaction property ........................................................................ 3-23
-
xii
LIST OF TABLES Starting with Convertigo Web Integrator - CEMS 5.2.0
-
1 Getting Familiar with Convertigo Web IntegratorThis chapter pre(CWI), one of Co
It also describesicons, menu).
CWI Purpose
CWI Concept
CWI Studio
1.1 CWI Purpo
The purpose of CSOAP or REST and business log
Figure 1 - 1: CWI ex
CWI encompass
Reusing a co
Integrating no1 - 1
sents the purpose, concepts and objects of Convertigo Web Integratornvertigo Enterprise Mashup Studios components.
the four views of the Eclipse-based CWI Studio and their elements (tabs,
s and Objects
se
onvertigo Web integrator (CWI) is to expose Web sites and applications asWeb services, thus allowing any application to programmatically access dataic from the target Web site.
poses any Web application as SOAP or REST Web service
es a large number of applications:
mpanys Web applications assets to build new SOA based architectures.
n intrusively existing enterprise Web applications into SOA.
-
1 - 2
Chapter "Getting Familiar with Convertigo Web Integrator" CWI Concepts and Objects
Creating APIs on any extranet or public Web application.
Collecting data from Web sites to be inserted in databases.
Exchanging data with any Web application such as eGoverment sites or partner sites.
Integrating Web applications into workflow managers or complex BPM processes.
1.2 CWI Concepts and Objects
This section presents the general concept and notions of the application. It then relates notionsto real CWI objects, which are defined in detail in a dedicated sub-section.
General Concept
CWI Objects
Definitions
1.2.1 Gene
CWI accesses apages, applicatinformation on th
Navigation and eset as part of a W
1.2.2 CWI O
The following tab
All of these objebeen created, thpage 1 - 7):
Table 1 - 1: CWI
Obje
a. Objects are inde(see Figure 1 - 2
Connector
Screen class
Criterion
Extraction ru
Transaction
Screen clas
State
VariableStarting with Convertigo Web Integrator - CEMS 5.2.0
ral Concept
target HTML Web site and navigates through HTML Web screens (Webion forms, etc.) following variable-based transactions while extractinge fly. During the extraction process, information is structured into XML format.
xtraction depend on events, actions and rules based on CWI objects, that areeb integration project.
bjects
le describes CWI objects:
cts are set when developing a Web integration project. Once the project hasese objects appear in the Projects View of the CWI Studio (CWI Studio on
objects
cta
nted in the column according to their indentation level in the Project tree of the Projects View )
Description
Defines the type of application (HTTP, HTML, etc.) CWI connects to.
Defines a family of HTML pages with common characteristics.
Is a characteristic element defining a specific screen class.
le Is a rule defining how information is extracted from a Web screen.
Defines the actions Convertigo must perform on HTML pages.
s handler Is a set of statements triggered on detection of a specific screen class.
ment Is an action or set of actions performed on HTML pages.
Is the transaction input variable.
-
Chapter "Getting Familiar with Convertigo Web Integrator"CWI Concepts and Objects
Figure 1 - 2: CWI ob
1.2.3 Defin
CONNECTOR
A connector defi
Two types of con
TC
Table 1 - 2: Conn
Connector
HTTP Connector
HTML connector1 - 3
jects
itions
nes the type of application or data source CWI is able to connect to.
nectors are available in CWI:
o get started with your first Web integration project, see "My Firstonvertigo Web Integrator Project" on page 2-1
ector types
type Description
Enables Convertigo to connect to any HTTP flow.
Enables Convertigo to connect to any HTML based application.
-
1 - 4
Chapter "Getting Familiar with Convertigo Web Integrator" CWI Concepts and Objects
Connector parameters are set in the New Project wizard which opens when creating a Webintegration project (see "Creating a Project and Setting Connector Parameters" on page 2-6).
Once a new project has been created, the corresponding connector appears in the Connectorsfolder of the project, in the Projects View of the CWI Studio (CWI Studio on page 1 - 7):
Figure 1 - 3: Connectors folder
SCREEN CLASS
Screen classes common charact
uniquely iden
associated to
A single root screproject The defarule. It is parent
A screen class bfor instance) is cparent screen cla
Inherited screenView of the CWI
Figure 1 - 4: Screen
CRITERION
A Web screen isby the user for Starting with Convertigo Web Integrator - CEMS 5.2.0
define families of Web screens (Web page, application form, etc.) witheristics (called criteria). Screen classes are:
tified by at least one criterion;
extraction rules.
en class, called Default screen class, is created by default when creating ault screen class is not - but can be - associated to any criterion or extractionto all other screen classes.
ased on a parent screen class (defined with extra criteria and extraction rulesalled Inherited screen class. It inherits criteria and extraction rules from itsss.
classes appear in subfolders named Inherited screen classes in the Projects Studio (CWI Studio on page 1 - 7):
classes and Inherited screen classes folders
considered belonging to a screen class when it matches all criteria definedthis screen class. If no criterion matches, the Web screen is considered
-
Chapter "Getting Familiar with Convertigo Web Integrator"CWI Concepts and Objects
belonging to the default screen class (provided that the default screen class is not associatedto any detection criterion).
Criteria are set using the XPath language, which addresses parts of an XML document; anXPath can be, for example, the path to a logo, a field, a word or any other identifying elementon a Web screen. If at least one element is adressed by the XPath on the HTML page, thecriterion matches.
XPaths are managed in the CWI Studio using the XPath Evaluator, which is part of theConnector View of the CWI Studio (CWI Studio on page 1 - 7).
Figure 1 - 5: Criterio
EXTRACTION RULE
Extraction rules dare used to gene
During a transacscreen class mafrom its inherited
TRANSACTION
Web page navigstatements) perfassociated with t
A transaction ca
taking variabl
producing an
When executingthe JavaScript sccan be set as inpvariable setting,
Once a transac
Fa1 - 5
n and inherited criterion
efine which data must be extracted from the target application and how. Theyrate parts of the XML output resulting from a transaction execution.
tion process, when a screen class is detected (i.e. all criteria defined for thistch), all screen class-specific extraction rules are executed, including those screen classes.
ation is performed using HTML transactions, that is a series of actions (calledormed from one Web screen to another. These actions are triggered by eventshe detected screen class, using screen class handlers.
n be summed up as a process:
es in input;
XML document in output.
the transaction, transaction variables are automatically added as variables ofope. Transaction variables can be given a fixed (default) value, or their valueut parameter prior to executing a transaction (for an example of transaction
see "To set a transaction variable" on page 2-42).
tion has been defined, it appears in the Transactions folder of a given
or more information on XPath, see the XPath tutorial from W3Schoolst http://www.w3schools.com/XPath/.
-
1 - 6
Chapter "Getting Familiar with Convertigo Web Integrator" CWI Concepts and Objects
connector, in the Projects View of the CWI Studio (CWI Studio on page 1 - 7):
Figure 1 - 6: Transactions folder
SCREEN CLASS HA
Basically, a screaccessed this sc
In other words, user-defined critdefined actions.
In CWI, two type
Once a screen transaction, in th
Table 1 - 3: Scre
Screen clas
on
on
OStarting with Convertigo Web Integrator - CEMS 5.2.0
NDLER
en class handler is meant to answer the following question : "Now that I havereen, what am I supposed to do?".
once a given screen class has been accessed and detected (depending oneria), the corresponding screen class handler is triggered and performs user-
s of screen class handler can be defined for a given screen class:
class handler has been created, it appears in the Functions folder of thee Projects View of the CWI Studio (CWI Studio on page 1 - 7):
en class handler types
s handler Description
Entry Triggered when CWI detects a given .Is identified by a symbol on the left of the screen class handler name
(see Figure 1 - 7).
Exit Triggered after extraction rules have been executed for a given .
Is identified by a symbol on the left of the screen class handler name
(see Figure 1 - 7).
ther handlers exist, but their use falls out of the scope of this guide.
-
Chapter "Getting Familiar with Convertigo Web Integrator"CWI Studio
Figure 1 - 7: "On en
A screen class h
For example: Ty
Figure 1 - 8: List of s
STATEMENT
A statement is athat the screen cdata inputs, etc.
1.3 CWI Studio
This section desView, the Conne
General Pres
View Descrip
1.3.1 Gene
Based on an Ecl
one or severa
icons;
three standar
The four views a1 - 7
try" and "on exit" screen class handlers
andler can contain one or several statements.
pe in a keyword (statement #1) then click on Next (statement #2):
tatements
n action that is performed on entry to or on exit from a Web screen, providedlass of this Web screen has been detected. Statements include mouse clicks,
cribes the CWI Studio, which comprises four distinct panes: the Projectsctor View, the Properties View and the Console.
entation
tion
ral Presentation
ipse platform, the CWI Studio is divided into four views that all include:
l tabs, located either on the upper or on the lower part of the view;
d Eclipse buttons (Menu , Minimize and Maximize )
re illustrated and named in the figure below:
-
1 - 8
Chapter "Getting Familiar with Convertigo Web Integrator" CWI Studio
Figure 1 - 9: CWI St
The following tab
1.3.2 View
All four views co
Table 1 - 4: CWI
View Nam
Projects View
Properties View
Connector View /
Console ViewStarting with Convertigo Web Integrator - CEMS 5.2.0
udio
le describes the views of the Convertigo Web Integrator studio:
Description
ntain common standard Eclipse buttons in the upper right corner of the view:
views
e Description
Displays in a tree structure all CWI projects and their related objects: connectors, screen classes (including criteria, extraction rules and inherited screen classes) and transactions (including screen class handlers and statements).
Displays the properties of the CWI object selected in the Projects View. Used to modify a given object properties.
Editor The Connector View is divided into three parts: the DOM Tree - displays the Document Object Model Tree (logical structure)
of the Web page accessed by CWI; the Web Browser - displays the Web page accessed by CWI, depending on
the connector parameters set for the project; the XPath Evaluator - displays XPath-specific information such as the XPath
expression of a node. Contains shortcuts (icons) for creating new child screen classes, criteria, extraction rules, statements, etc.
It is automatically updated as an Editor whenever files (in editable format such as .xml, .css or .xsl) are edited.
Standard Eclipse console. Displays information about running tasks, errors, etc. Different consoles are available (standard output, Engine, Studio, etc.).
-
Chapter "Getting Familiar with Convertigo Web Integrator"CWI Studio
The following section describes the four panes of the CWI Studio in further detail.
PROJECTS VIEW
The Projects VieInformation appe
In addition to pro9), the Projects
In the Projects V
plain text - un
bold - object
Table 1 - 5: Standard Eclipse buttons
Button Icon Description
Menu Opens a drop-down menu with a content identical to the view toolbar.
Minimize Minimizes the current view without closing it. The view is still visible in the form of an icon that can be clicked to be enlarged again.
Maximize Maximizes the current view to full screen. Maximizing a view can also be achieved by double clicking the upper bar of the view.
Table 1 - 6: Proje
View
Tabs Proje
Proje
Icons1 - 9
w displays all information (objects, folders and files) associated with a project.ar in distinct tree structures depending on the selected tab.
ject tree structures and standard Eclipse buttons (see Table 1 - 5 on page 1-View includes tabs and icons:
iew, objects are formatted depending on their status:
changed object;
changed and unsaved;
cts View tabs and icons
Element Description
cts Convertigo view. Contains all projects and related objects. Objects are presented in a tree structure and contained in folders.
ct Explorer Standard Eclipse view. Contains all projects and related files. Files are represented in a tree structure.
Find icon. Search project tree for a specific object
Decrease priority icon.Available only if an extraction rule is selected. Decreases its priority, thus changing extraction rules execution order for this screen class and its children classes.
Increase priority icon.Available only if an extraction rule is selected. Increases its priority, thus changing extraction rules execution order for this screen class and its children classes.
Save icon.Available only when at least one object has been modified. Saves all the project. All objects will be saved, and project xml file will be updated.Note: Remember to save projects on a regular basis.
Refresh icon.Available only if a project is selected. Refreshes Projects View by reloading project content from the hard drive. Identical to closing and re-opening project.
Import icon.Imports a new project (in CAR or XML format) in the Studio.
-
1 - 10
Chapter "Getting Familiar with Convertigo Web Integrator" CWI Studio
red - disabled object;
orange - not executable object (child of a disabled object);
grey - inherited object.
The figure below shows the different types of formatting in the Projects View.
Figure 1 - 10: Forma
Once an objectProperties View
PROPERTIES VIE
The Properties
Properties depe(Expert, Configuclicking on )Starting with Convertigo Web Integrator - CEMS 5.2.0
tting of unsaved, inherited and disabled objects in the Projects View
has been selected in the Projects View, its properties appear in the.
W
View displays the properties of the object selected in the Projects View.
nd on the type of the selected object. They are presented by categoriesration, Information, Selection, Properties, etc. - categories can be hidden by that can be either expanded or collapsed by clicking respectively on or .
-
Chapter "Getting Familiar with Convertigo Web Integrator"CWI Studio
Figure 1 - 11: Prope
In addition to objProperties View
To edit properti1 Left-click o
2 Depending
the valfield:
Figure 1 - 12: Text p
the va
Table 1 - 7: Prop
Icon1 - 11
rties View
ect properties and standard Eclipse buttons (see Table 1 - 5 on page 1-9), the includes one tab (Properties) and icons:
es n the property value in the Properties View;
on the property type:
ue is highlighted with white background; edit it by typing a new value in the
roperty field edition
lue is highlighted with clear blue background; edit it by typing a new
erties View icons
Description
Show Categories icon. Displays or hides property categories. When categories are hidden, properties appear in alphabetical order.
Show Advanced Properties icon.Displays or hides advanced properties, if relevant. Not used for Convertigo objects properties.
Restore Default Value icon.Available only when a property has been modified. Restores it to default value. Not used for Convertigo objects properties.
-
1 - 12
Chapter "Getting Familiar with Convertigo Web Integrator" CWI Studio
JavaScript expression in the field:
Figure 1 - 13: Javascript expression field edition
the value is higlighted and a symbol appears on the right of the field:
Figure 1 - 14: Text p
Click on
A. A Ja
Figure 1 - 15: JavasStarting with Convertigo Web Integrator - CEMS 5.2.0
roperty edition via edition window
. Depending on the property type:
vascript expression window or an edition window opens:
cript expression window
-
Chapter "Getting Familiar with Convertigo Web Integrator"CWI Studio
Figure 1 - 16: Editio
1 Ent2 ClicB. A c
stat
Figure 1 - 17: Config
1 Set2 Clic
the val
Figure 1 - 18: Text p1 - 13
n window
er the new JavaScript expression or the text comment in the window.
k on OK.onfiguration window opens (in this example, the Trigger Editor window for
ement synchronization):
uration window
the parameters in the different fields and menus.
k on OK.
ue is higlighted and a symbol appears on the right of the field:
roperty edition via drop-down menu
-
1 - 14
Chapter "Getting Familiar with Convertigo Web Integrator" CWI Studio
1 Click on . A drop-down menu appears:
Figure 1 - 19: Selecting a value in the drop-down menu
2 Sele3 Pre
CONNECTOR VIE
The Connector the type of conninto three parts.
The table below
As other views, t
Table 1 - 8: Conn
Name
Dom Tree
Web Browser
XPath EvaluatorStarting with Convertigo Web Integrator - CEMS 5.2.0
ct a value in the drop-down menu.
ss Enter.
W
View displays connector-specific information. This information depends onector set for the project. In the case of HTML connectors, the view is divided
describes these parts:
he Connector View also includes tabs and icons:
ector View parts
Description
Displays the logical structure of the Web page displayed in the Web Browser.
Displays the Web page accessed by CWI.
Displays the XPath expression of HTML elements selected in the Web Browser and result nodes of the written XPath execution on the DOM Tree. Includes shortcuts to create new criteria, statements, screen class, etc.
-
Chapter "Getting Familiar with Convertigo Web Integrator"CWI Studio
Figure 1 - 20: Conne
The table below is selected (by d
Table 1 - 9: Conn
View
Tab Outp
Desi
Icon1 - 15
ctor View (Design tab selected)
describes the Connector View icons that are available when the Design tabefault):
ector View tabs, icons and field
Element Description
ut Displays the XML code generated following either a transaction (according to extraction rules) or a Generate XML action (see Figure 1 - 21).
gn Displays the Dom Tree, Web Browser and XPath Evaluator.
Toggle auto refresh domTree icon.Press this icon to activate the auto refresh domTree function, which automatically refreshes the DOM Tree when a new Web page is displayed in the Web Browser.
Show current screen class icon.Highlights in light grey in the Projects View the screen class detected for the Web page currently displayed in the Web Browser.
Generate XML icon.Generates the XML code of the Web page currently displayed in the Web Browser by executing the related extraction rules.The XML code is displayed under the Output tab of the Connector View.
Stop transaction icon.Stops the current transaction (while a transaction is in process).
Learn icon.Available only when a transaction is selected. Activates the Learn mode - any action manually performed in the HTML Connector View is recorded and HTTP statements are automatically generated in corresponding screen class entry handler. For advanced users only.
-
1 - 16
Chapter "Getting Familiar with Convertigo Web Integrator" CWI Studio
When the Output tab is selected, the view displays XML code:
Figure 1 - 21: Conne
DOM TREE
The DOM Tree mcontains all nodedisplayed in the
root nodes;
element node
text nodes;
attribute node
namespace n
processing in
comment nod
Accumulate learning mode icon.Available only when Learn mode is on. Accumulate data on visited Web screens. Screen class exit handlers are automatically created in the transaction. For advanced users only.
Field Current selection Displays the path to the node selected in the DOM Tree (left-click) or to the HTML element selected in the Web Browser (right-click).
Table 1 - 9: Connector View tabs, icons and field (...)
View Element DescriptionStarting with Convertigo Web Integrator - CEMS 5.2.0
ctor View (Output tab selected)
odels an HTML (or XML) document as a tree structure called node-tree. Its (parents, children and siblings) of the HTML document it models (which is
Web Browser in the CWI Studio):
s;
s;
odes;
struction nodes;
es.
-
Chapter "Getting Familiar with Convertigo Web Integrator"CWI Studio
Figure 1 - 22: DOM
The table below
WEB BROWSER
The Web Browsin the DOM Tree
The Web BrowTarget Server inParameters" on
All HTML elemenbe selected withBrowser, and th
Table 1 - 10: Dom
Icon1 - 17
Tree
describes the Dom Tree icons:
er displays the Web page accessed by CWI. Its logical structure is displayed on the left of the browser.
ser automatically displays the HTML page of the HTTP server defined as the connector parameters (see "Creating a Project and Setting Connector
page 2-6).
ts (image, text, field, etc.) of Web pages displayed in the Web Browser can a right-click. Once selected, they appear outlined in green in the Webeir corresponding element node is highlighted in green in the DOM Tree:
Tree icons
Description
SyncTree icon.Synchronizes the DOM Tree with the HTML page currently displayed in the Web Browser.
Parent node icon.Selects and highlights in green the parent node of the node currently selected in the DOM Tree. Also outlines in green the corresponding HTML element in the Web Browser.
Previous node icon.Selects and highlights in green the node appearing before the node currently selected in the DOM Tree. Also outlines in green the corresponding HTML element in the Web Browser.
Next node icon.Selects and highlights in green the node appearing after the node currently selected in the DOM Tree. Also outlines in green the corresponding HTML element in the Web Browser.
-
1 - 18
Chapter "Getting Familiar with Convertigo Web Integrator" CWI Studio
Figure 1 - 23: HTML
The XPath exprrequired elemenby selecting it wiXPath Evaluato
The Web Brows
standard Brow
an address b
icons to mana
allows to disp
XPATH EVALUATO
The purpose of previously selectresults from runn
The XPath Evalscreen classes,
Figure 1 - 24: XPath
The table below Starting with Convertigo Web Integrator - CEMS 5.2.0
element selected in the Web Browser
ession of selected elements can be generated either by right-clicking thet node in the DOM Tree and selecting Generate selection XPath (Enter) orth a left-click and pressing Enter. The XPath expression then appears in ther.
er includes a toolbar with:
ser icons - Back, Forward, Refresh, Stop;
ar;
ge Web page tabs - New, Close, Next, Previous. The Allow alert box icon
lay or block JavaScript pop-ups (blocked by default in CWI).
R
the XPath Evaluator is to display the XPath expression of HTML elementsed (with a right-click) in the Web Browser. The Document field displays theing the XPath expression specified in the xPath field on the DOM Tree.
uator also contains icons for creating new CWI objects (statements, criteria,etc.) from the XPath currently displayed in the xPath field.
Evaluator
describes the fields and icons of the XPath Evaluator:
-
Chapter "Getting Familiar with Convertigo Web Integrator"CWI Studio
Table 1 - 11: XPath Evaluator elements
XPath Evaluator Elements Description
Field xPath Displays the XPath expression generated from the Dom Tree. XPath expressions can also be typed in directly and evaluated on the DOM Tree
Document Displays the results from running the XPath expression specified in the xPath field on the DOM Tree.
Icon Create new child screen class from current XPath icon.Launches the New Screen Class wizard.
Create new criterion from current XPath icon.Launches the New Criterion wizard.
Icon1 - 19
Create new extraction rule from current XPath icon.Launches the New Extraction Rule wizard.
Create new statement from current XPath icon.Launches the New Statement wizard. Available when a suitable handler or container is selected in the Projects View only.
Evaluate XPath icon.Evaluates the XPath expression specified in the xPath field on the DOM Tree. The result is displayed in the Document field.
Backward XPath History icon.Displays in the xPath field the previous XPath stored in the XPath history.
Forward XPath History icon.Displays in the xPath field the next XPath stored in the XPath history.
Set Anchor icon.Applicable after an XPath has been generated for a given node. Fixes this XPath and highlights it in yellow. Used to develop complex XPath from an anchored XPath.
-
1 - 20
Chapter "Getting Familiar with Convertigo Web Integrator" CWI StudioStarting with Convertigo Web Integrator - CEMS 5.2.0
-
2 My First Convertigo Web Integrator ProjectThis chapter give
Web Integrat
Setting Up th
2.1 Web Integr
This section dessample project fr
Project Descr
Opening the
2.1.1 Proje
The purpose of tthe Web page titrequest.
The process can
1 Connect to2 - 1
s instructions on how to set up a simple CWI project.
ion Project
e Project
ation Project
cribes the sample Web Integrating project and how to open the completedom the Studio:
iption
Sample Project from the Studio
ct Description
his Web integration project (called sample_doc_CWI) is to list in XML formatles and URLs returned by the Google search engine after sending a search
be summed up as follows:
the Google Web site at www.google.com.
-
2 - 2
Chapter "My First Convertigo Web Integrator Project" Web Integration Project
Figure 2 - 1: Connec
2 Key in secliplet)
Figure 2 - 2: Input o
3 Click on th
WperSSStarting with Convertigo Web Integrator - CEMS 5.2.0
tion to Google HTTP server
arch keywords in the Google search field (for example convertigo.
f a search keyword in Google search field
e Google Search button.
hen accessing the www.google.com Website, Google serversrovide Internet users with a localized version of the Google searchngine. This adds an extra step to the project, since CWI must then beedirected to the standard Google home page (see "Case of a Country-pecific Google Search Page - Definition of the googleSearchPageFRcreen Class" on page 2-21).
-
Chapter "My First Convertigo Web Integrator Project"Web Integration Project
Figure 2 - 3: Click o
4 Extract allresults.
Figure 2 - 4: Extract
5 Click on th2 - 3
n Google Search button
Web page titles and URLs (according to extraction rules) into an XML list of
ion of Web page titles and URLs in XML format
e Next link.
-
2 - 4
Chapter "My First Convertigo Web Integrator Project" Web Integration Project
Figure 2 - 5: Click o
6 Repeat ste
7 Stop proce
Figure 2 - 6: Proces
2.1.2 Open
The completed s
The following prwizard.
To open the sam1 In the StuStarting with Convertigo Web Integrator - CEMS 5.2.0
n the Next button
p 4 as long as a Next link is present on Google result pages.
ss when accessing a Google result page with no Next link:
s stops on Google page with no Next button
ing the Sample Project from the Studio
ample project is stored in the application installation folder.
ocedure describes how to access it from the Studio using the New Project
ple project from the Studiodio toolbar, click on New > Project:
-
Chapter "My First Convertigo Web Integrator Project"Setting Up the Project
Figure 2 - 7: Launching the New Project wizard
A New Pr
2 Expand CIntegrator
Figure 2 - 8: Selecti
3 Click on N
The samp
Figure 2 - 9: Sample
You can now sta
2.2 Setting Up
The following se
Creating a Pr
Defining Scre2 - 5
oject wizard opens.
onvertigo Projects, then Documentation Samples, and select Web:
ng the CWI sample project
ext, then Finish.
le project opens in the Projects View:
project in Projects View
rt setting up your own CWI project, sample_doc_CWI.
the Project
ctions explain step by step how to set up your first Web integration project:
oject and Setting Connector Parameters
en Classes
-
2 - 6
Chapter "My First Convertigo Web Integrator Project" Setting Up the Project
Defining a Transaction
Executing a Transaction
2.2.1 Creating a Project and Setting Connector ParametersThe first steps consist in creating a project and setting the connector parameters.
In this sample project:
the project is called sample_doc_CWI; the connector is an HTML connector called GoogleConnector accessing the
www.google.com Web site.
To create a pro1 Launch th
2 Select File
A New Pr
Figure 2 - 10: New P
3 Expand CStarting with Convertigo Web Integrator - CEMS 5.2.0
ject and set a connectore Convertigo Studio.
> New > Project or click on in the toolbar then select Project.
oject wizard opens:
roject wizard
onvertigo Projects, then Web Integration and select HTML Web Site:
-
Chapter "My First Convertigo Web Integrator Project"Setting Up the Project
Figure 2 - 11: Selec
4 Click on N
Figure 2 - 12: Enteri
5 In the Pro
6 Click on N
Iw2 - 7
ting a project type
ext:
ng a project name
jects name field, type in the project name (for example sample_doc_CWI).ext:
n this example, HTML Web Site is selected because the application tohich CWI connects is an HTML application.
-
2 - 8
Chapter "My First Convertigo Web Integrator Project" Setting Up the Project
Figure 2 - 13: Enteri
The Consample_dGoogleCo
7 Click on N
Figure 2 - 14: Settin
8 Enter the
in the www.g
in the H
check Starting with Convertigo Web Integrator - CEMS 5.2.0
ng a connector name
nector name field is filled by default with the connector nameoc_CWIConnector. You can choose another name (for examplennector).ext:
g HTTP or HTML connector parameters
required information in the Target Server section:
HTTP server field, type in the target Web site URL address (for exampleoogle.com);TTP Port field, type in the HTTP port if required (left blank here);
the SSL checkbox to connect to a secured HTTP server (HTTPS server).
-
Chapter "My First Convertigo Web Integrator Project"Setting Up the Project
9 If youre using a proxy server to access the Web, enter the required information in theProxy Server section (Proxy Server address, Proxy Port and SSL).
10 Click on Next.
The New project summary window opens:
Figure 2 - 15: New p
11 Check thaProjects V
Figure 2 - 16: samp
As we can see, respective folder
The HTML pagethe Connector V2 - 9
roject summary window
t all settings are correct, then click on Finish. The project now appears in theiew:
leGoogle project appearing in the Projects View
one screen class and one transaction have been created by default in theirs (Screen Classes and Transactions).
of the HTTP server specified as Target Server opens in the Web Browser ofiew, with its DOM Tree to the left:
-
2 - 10
Chapter "My First Convertigo Web Integrator Project" Setting Up the Project
Figure 2 - 17: DOM
This is the startin
The following stecriteria (in other unique manner).
When CWI accematch page elemtrigger a number
2.2.2 Defin
To define a screconsidered sufficdetection criterio
1 Selecting
2 Finding its
3 Generatin
4 Creating t
ROOT SCREEN C
In our example pscreen class (cageneral Google W
Web pages onlyThis means thatStarting with Convertigo Web Integrator - CEMS 5.2.0
Tree and Web Browser
g point of the project.
ps will consist in defining a set of screen classes and in setting their detectionwords, HTML elements - text, links, logos, etc. - defining a Web screen in a
sses a page, all criteria are checked; if all criteria defined for a screen classents, then CWI associates the page to this screen class. It is then able to
of statements (Defining a Transaction on page 2 - 40).
ing Screen Classes
en class, you must first find which element of the accessed Web page can beiently relevant to serve as a detection criterion. Using a screen element as an for a screen class means:
the element on the screen (right-click).
most accurate nodes in the DOM Tree.
g its XPath.
he corresponding screen class from the generated XPath expression.
LASS
roject, the first and most general screen class that we want to define as rootlled googleWebPages in this project) is the screen class representing aeb page.
(not images, videos, maps, etc.) are to be returned throughout our search. the Web item must be present on the upper left corner of the Google page,
-
Chapter "My First Convertigo Web Integrator Project"Setting Up the Project
and that this item must be plain text (as opposed to hypertext) (see Figure 2 - 18):
Figure 2 - 18: Goog
This item will ser
The following pro
1 select a googleWe
2 generate t
3 create the
To find a criteri1 Right-click
The elemehighlighted
Figure 2 - 19: Detec2 - 11
le general search page
ve as detection criterion for the root screen class.
cedure explains how to:
detection criterion for a screen class (here, the parent screen classbPages);
he corresponding XPath;
screen class from the generated XPath expression.
on and create a screen class on the word "Web" in the upper left corner of the Google search page.
nt is outlined in green in the Web Browser and its corresponding element is in green in the DOM Tree:
tion criterion of the googleWebPages parent screen class
-
2 - 12
Chapter "My First Convertigo Web Integrator Project" Setting Up the Project
2 Expand the B element in the Dom Tree:
Figure 2 - 20: Expan
Two noderepresentscontent.
We will se
3 While keeTree with
4 Press En(Enter).
Figure 2 - 21: Gene
An XPath Starting with Convertigo Web Integrator - CEMS 5.2.0
ding an element to find relevant nodes and attributes in the DOM Tree
s are associated with the B element: an attribute node (class="gb1"), which the element class, and a text node (Web), which represents the element
lect both and generate the XPath from the selection.
ping the Shift button pressed, select class="gb1" and Web in the DOMa left-click.
ter or right-click on the selection and select Generate selection XPath
rating XPath from class and text attributes
expression corresponding to the selected HTML element (i.e. "Web" item) and
-
Chapter "My First Convertigo Web Integrator Project"Setting Up the Project
based on the two nodes selected as detection criteria for the googleWebPages screenclass appears in the XPath Evaluator:
Figure 2 - 22: XPath expression of the detection criterion of the googleWebPages screen class
The resultwindow.
We will ngoogleWe
5 In the Prothe new sc
6 In the XP
XPath ico
A New Sc
Figure 2 - 23: New S
7 Click on S2 - 13
s from running the XPath expression on the DOM appears in the Document
ow create, based on this XPath, the corresponding screen class calledbPages:jects View, select the screen class that we want as parent screen class forreen class (Default_Screen_Class).
ath Evaluator, click on the Create new child screen class from current
n .
reen Class wizard opens:
creen Class wizard
creen class, then on Next:
-
2 - 14
Chapter "My First Convertigo Web Integrator Project" Setting Up the Project
Figure 2 - 24: Giving
8 Type in th
9 Click on F
The screeDefault_
Figure 2 - 25: New s
The criterappears in
10 Save your
To check if dete1 Assuming
Show cur
In the Proshowing thWeb page
2 Type in a
RStarting with Convertigo Web Integrator - CEMS 5.2.0
a name to the new screen class
e name of the new screen class (for example googleWebPages).inish.
n class now appears in the Projects View as an inherited screen class of theScreen_class:
creen class appearing in the Projects View
ion previously defined for the screen class is automatically generated and the related Criteria folder.
project by clicking on or by pressing Ctrl + S.
ction criteria are correct for this screen class that the Web Browser currently displays a Google search page, click on the
rent screen class icon in the Connector View toolbar.
jects View, the googleWebPages screen class is automatically highlighted,at detection criteria identifying this screen class do all match on the accessed.
keyword (for example convertigo cliplet).
emember to save your project on a regular basis.
-
Chapter "My First Convertigo Web Integrator Project"Setting Up the Project
3 Click on the Google Search button.
Results are displayed in a Google result page.
4 In the Connector View toolbar, click on the Show current screen class icon .
As in the first step, the googleWebPages screen class is automatically highlighted inthe Projects View.
We can infere from this test that the detection criterion identifying Google Web pages iswell set. Indeed, each type of Google Web page (result and search pages) which will beaccessed throughout the transaction is detected by CWI as belonging to thegoogleWebPages screen class.
ADDITIONAL SC
The following ste
one correspo
one correspo
Those two additthey are Google is a search page
They automaticacriteria are then
To create additiogoogleWebPag1 Selection
2 Generatio
3 Creation o
GOOGLESEARCHP
To create the go1 Right-click
Browser:
FhgSoEC2 - 15
REEN CLASSES
p consists in creating two additional scren classes:
nding to the Google search page, called googleSearchPage;nding to Google result pages, called googleResultsPage.
ional screen classes are children screen classes of the googleWebPages:Web pages, but they also present specificities (one is a result page, the other).
lly inherits the googleWebPages detection criteria. Additional detectionadded to their definition to differentiate them.
nal screen classes, the procedure is identical to the one used to create thee screen class:of a detection criterion.
n of XPath.
f a new child screen class.
AGE SCREEN CLASS
ogleSearchPage screen class on the Google logo in the Google search page displayed in the Web
or projects set in countries for which a country-specific Googleomepage exists, a third screen class must be defined :oogleSearchPageFR (see "Case of a Country-Specific Googleearch Page - Definition of the googleSearchPageFR Screen Class"n page 2-21). This screen class is detected using the "Google.com innglish" link of country-specific Google homepages. It is used to redirectWI back to the English Google search page.
-
2 - 16
Chapter "My First Convertigo Web Integrator Project" Setting Up the Project
Figure 2 - 26: Selec
2 Expand th
Figure 2 - 27: Attribuclass
An id attribute nunique manner, selement.
TpStarting with Convertigo Web Integrator - CEMS 5.2.0
tion of a detection criterion for the googleSearchPage screen class
e DOM Tree to select the most relevant node for this HTML element:
te nodes of the HTML element selected as detection criterion for the googleSearchPages screen
ode appears in the tree. This attribute always define an HTML element in ao this is the best candidate for generating the XPath corresponding to the IMG
his detection criterion has been selected because the Google searchage does include a Google logo, whereas result pages do not.
-
Chapter "My First Convertigo Web Integrator Project"Setting Up the Project
3 Right-click the id="logo" attribute node and select Generate selection XPath(Enter) or select the attribute node with a left-click and press Enter.
The XPath
4 Select the
5 In the XP
XPath ico
The New
6 Click on S
7 Type in th
8 Click on F
The screegoogleWe
Figure 2 - 28: googl
The criterion prein the related Cri
9 Save your
The next step coresult pages.
When present, the NAME attribute is also a good candidate. On thecontrary, avoid using attribute nodes such as SRC and HREF, which arenever good candidates for generating XPath corresponding to detectioncriteria (their value often changes).
You can also right-click the parent IMG element and select Generateselection XPath (Enter) or select the IMG element with a left-click andpress Enter: the XPath will intelligently be generated with the idattribute, since it exists.
R2 - 17
expression of the selected node appears in the XPath Evaluator.
parent screen class, googleWebPages.ath Evaluator, click on the Create new child screen class from current
n .
Screen Class wizard opens.
creen class, then on Next.
e name of the new screen class (for example googleSearchPage).inish.
n class now appears in the Projects View as an inherited screen class of thebPages screen class:
eSearchPage screen class in Inherited screen classes folder
viously defined for this screen class is automatically generated and appearsteria folder. Criterion inherited from the parent screen class appears greyed.
project by clicking on or by pressing Ctrl + S.
nsists in creating the googleResultsPage screen class identifying Google
emember to save your project on a regular basis.
-
2 - 18
Chapter "My First Convertigo Web Integrator Project" Setting Up the Project
GOOGLERESULTSPAGE SCREEN CLASS
We will now launch a Google search with the keywords convertigo cliplet and define ascreen class for result pages.
To check that there will not be any conflict between the detection criteria of the different screenclasses, we can launch the search in the Web Browser and evaluate the XPath previouslydefined using the XPath Evaluator.
To check that the criterion is a unique identifier for a screen class1 Type in convertigo cliplet in the Google search field of the Web Browser.2 Once results have been displayed in the browser, type in the previous XPath (//
IMG[@id=
3 Click on
The result
Figure 2 - 29: googl
No IMG elthe page googleSeclass.
To create the go1 In the Web
Figure 2 - 30: SelecStarting with Convertigo Web Integrator - CEMS 5.2.0
"logo"]) in the xPath field of the XPath Evaluator.
in the XPath Evaluator.
of the evaluation is displayed in the Document field of the XPath Evaluator:
eSearchPage XPath evaluation on a Google result page
ement with an id="logo" attribute was found in the document tree model ofcurrently displayed in the Web Browser. The criterion selected for thearchPage screen class is therefore a unique differentiator for this screen
ogleResultsPage screen class Browser, right-click on the right of the first result:
tion of a detection criterion for the googleResultsPage screen class
-
Chapter "My First Convertigo Web Integrator Project"Setting Up the Project
The LI container is highlighted in the DOM Tree. This container contains all nodesmodeling the first result returned by the Google search engine. It does not include anyattribute that could possibly serve as strong differentiator for the screen class. We mustnow look in parent nodes to find one.
2 To do so, click on to select the parent node.
An OL con
This elemcontent th
3 Click on
A DIV conthis node
4 Click on
Another D
Figure 2 - 31: Searc
This nodean elemen
5 Right-clickor select t
Searching for a detection criteria is a sampling process that requiresselecting different elements on the page to select the best suited fordetecting a screen class.2 - 19
tainer is outlined in green.
ent is useless for detection because it does not contain any attribute or textat could serve to generate an accurate XPath for the selected HTML element.
.
tainer is outlined in green. For the same reasons as previously mentioned,cannot be used to generate an accurate XPath.
.
IV container is highlighted in green:
hing for a better detection criterion
contains an id attribute, which can be used as unique attribute for generatingts XPath, as mentioned in the definition of the previous screen class.
the id="res" attribute node and select Generate selection XPath (Enter)he attribute node with a left-click and press Enter:
-
2 - 20
Chapter "My First Convertigo Web Integrator Project" Setting Up the Project
Figure 2 - 32: Generation of XPath for the googleResultsPage screen class
The XPath
6 Select the
7 In the XP
XPath ico
The New
8 Click on S
9 Type in th
10 Click on F
The screegoogleWe
Figure 2 - 33: googl
The criterappears inappears g
The main screen
YspaStarting with Convertigo Web Integrator - CEMS 5.2.0
expression of the selected node appears in the XPath Evaluator.
parent screen class, googleWebPages.ath Evaluator, click on the Create new child screen class from current
n .
Screen Class wizard opens.
creen class, then on Next.
e name of the new screen class (for example googleResultsPage).inish.
n class now appears in the Projects View as an inherited screen class of thebPages screen class:
eResultsPage screen class in Inherited screen classes folder
ion previously defined for the screen class is automatically generated and the related Criteria folder. Criterion inherited from the parent screen class
reyed.
classes have been created.
ou can also right-click the parent DIV element and select Generateelection XPath (Enter) or select the IMG element with a left-click andress Enter: the XPath will intelligently be generated with the idttribute, since it exists.
-
Chapter "My First Convertigo Web Integrator Project"Setting Up the Project
As mentioned before, Google servers automatically redirect Internet users to country-specificGoogle search pages. In this case, you need to create a country-specific screen class, whichis detected when accessing the country-specific Google search page.
The following sub-section describes how to create such screen class.
CASE OF A COUgoogleSearcWhen CWI connexists, it automa
Country-specificredirecting hyper
a Google.co
Figure 2 - 34: Exam
a Go to Goo
As we develop transactions, additional screen classes will be needed inthe project.
The following extra screen class is needed only if you access thewbG2 - 21
NTRY-SPECIFIC GOOGLE SEARCH PAGE - DEFINITION OF THE hPageFR SCREEN CLASSects to the www.google.com Web site, if a Google country-specific pagetically opens in the browser.
Google homepages can be defined as standard Google search pages with atext link in the lower right corner of pages. This link can be:
m in English link in non-anglophone country specific pages:
ple of country-specific non-anglophone Google search page (France)
gle.com link in anglophone country-specific pages:
ww.google.com Web site from outside the USA and need CWI toe redirected from your country-specific Google page to theoogle.com page in English.
-
2 - 22
Chapter "My First Convertigo Web Integrator Project" Setting Up the Project
Figure 2 - 35: Exam
To detect coungoogleSearchas criteria alread
The country-spe
it is a child sccriteria:
"Web" link
Google log it is character
Google.com
To create a cou1 In the cou
on the red
The link ihighlightedStarting with Convertigo Web Integrator - CEMS 5.2.0
ple of country-specific anglophone Google search page (UK)
try-specific pages, we will define a country-specific screen class (calledPageFR in our project) having as detection criteria the redirecting link as welly defined for the googleSearchPage screen class.
cific screen class can be defined as follows:
reen class of the googleSearchPage screen class and, as such, inherits its
in upper left corner of the Google page;
o.
ized by the presence of a redirecting link (Google.com in English or Go to) in the lower right corner of the page.
ntry-specific (googleSearchPageFR) screen classntry-specific Google search page displayed in the Web Browser, right-clickirecting link (Google.com in English or Go to Google.com).
s outlined in green in the Web Browser and its corresponding element is in green in the DOM Tree:
-
Chapter "My First Convertigo Web Integrator Project"Setting Up the Project
Figure 2 - 36: Selec
We will exthe XPath
2 Expand th
Figure 2 - 37: Expan
Two nodethe URL aGo to Go
The hrefoften chan
3 Right-click
Generate
The XPath2 - 23
tion of a detection criterion for the country-specific screen class
pand the A element node to see if it contains interesting attributes to generate corresponding to the link.
e A element in the DOM Tree by clicking on :
ding an element to find relevant attributes in the DOM Tree for the country-specific screen class
s are associated with the A element: an href attribute node, which containsddress of the hypertext link, and a text node (Google.com in English orogle.com), which represents the element text content.
attribute is not interesting as regards XPath generation because its value mayge. We will select only the text content and generate the XPath from it.
the text node or and select
selection XPath (Enter) or select it with a left-click and press Enter.
expression of the selected node appears in the XPath Evaluator:
-
2 - 24
Chapter "My First Convertigo Web Integrator Project" Setting Up the Project
Figure 2 - 38: XPath expression of country-specific redirecting link
4 Select the parent screen class, googleSearchPage.5 In the XPath Evaluator, click on the Create new child screen class from current
XPath ico
The New
6 Click on S
7 Type in th
8 Click on F
The screegoogleSe
Figure 2 - 39: googl
The criterappears inappear gre
9 Save your
The next steps n
1 Creating t
2 Defining a
2.2.3 Creat
This step shows
The logical proceStarting with Convertigo Web Integrator - CEMS 5.2.0
n .
Screen Class opens.
creen class, then on Next.
e name of the new screen class (for example googleSearchPageFR).inish.
n class now appears in the Projects View as an inherited screen class of thearchPage screen class:
eSearchPageFR screen class in Projects View
ion previously defined for the screen class is automatically generated and the related Criteria folder. Criteria inherited from the parent screen classyed.
project by clicking on or by pressing Ctrl + S.
ow consist in:
he extraction rule associated with the googleResultsPage screen class. transaction.
ing an Extraction Rule
how to create a new extraction rule associated with a given screen class.
ss for creating an extraction rule is the following:
-
Chapter "My First Convertigo Web Integrator Project"Setting Up the Project
1 Generate the root XPath of the extraction rule in the XPath Evaluator - HTMLelements (data) to be extracted will be localized using relative XPaths based on thisroot XPath. We will choose as root XPath the googleResultsPage criterions XPath,(//DIV[@id="res"]), which addresses the results container in the Web page.
2 Create a new extraction rule by launching the New Extraction Rule wizard and choosean extraction rule template. In this sample project, the extraction rule template is Table. This is the best-suited formfor creating a list (called listResults) of Web page titles and URLs. This table canbe defined as follows:
3 set extracIn this pro
set row
set colView).
To create an ex1 Type in t
Evaluator
2 Click on th
Evaluator
A New Ex
Table 2 - 1: Table extraction rule - line and column description
Table eleme
Line
Column
Ywc
Yg2 - 25
tion rule properties in the Properties View. ject:
name and row XPath (Description property in the Properties View).
umn names and column XPaths (Description property in the Properties
traction rulehe root XPath (//DIV[@id="res"]) in the xPath field of the XPath
e Create new extraction rule from current XPath icon in the XPath
.
traction Rule wizard opens:
nt Name Description
resultItem Each table row corresponds to a Google result.
title and url Two columns per row - corresponding to Web page titles and URLs
ou can also choose to start by launching the New Extraction Ruleizard after having right-clicked on the relevant screen class. In thisase, you will have to set the root XPath property in the Properties View.
ou can also generate the root XPath from the DOM Tree (To create theoogleResultsPage screen class on page 2 - 18, steps 1 to 4).
-
2 - 26
Chapter "My First Convertigo Web Integrator Project" Setting Up the Project
Figure 2 - 40: New E
3 Click on T
4 Click on N
Figure 2 - 41: Giving
5 In the Nam
6 Click on NStarting with Convertigo Web Integrator - CEMS 5.2.0
xtraction Rule wizard
able.
ext:
a name to the new extraction rule
e field, enter a name (for example googleResults).ext:
-
Chapter "My First Convertigo Web Integrator Project"Setting Up the Project
Figure 2 - 42: HTML
This windobeen set aprior to lauis not the
7 Click on F
The new e
Figure 2 - 43: googl
8 Select the2 - 27
tables definition
w is a help to setting properties. It recalls the root XPath that has previouslynd displays parameters of HTML tables. It is used only if the XPath specifiednching the New Extraction Rule wizard corresponds to a HTML table. This
case in our project.
inish.
xtraction rule appears in the related Extraction rules folder:
eResults extraction rule in Extraction rules folder
extraction rule, its properties are displayed in the Properties View:
-
2 - 28
Chapter "My First Convertigo Web Integrator Project" Setting Up the Project
Figure 2 - 44: Extrac
We must now se
9 Set the Ta
10 If the XPa
W