DENODO ITPILOT 4.6 DEVELOPER GUIDE

Update Aug 16th, 2011


NOTE: This document is confidential and is the property of Denodo Technologies (hereinafter Denodo). No part of the document may be copied, photographed, transmitted electronically, stored in a document management system or reproduced by any other means without prior written permission from Denodo.

Copyright 2011. This document may not be reproduced in total or in part without written permission from Denodo Technologies.


INDEX

PREFACE
    SCOPE
    WHO SHOULD USE THIS DOCUMENT
    SUMMARY OF CONTENTS

1 INTRODUCTION

2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
    2.1 WEB SERVICE TYPES
    2.2 INVOKING SOAP WEB SERVICES
    2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
        2.3.1 HTML Output Configuration
    2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES

3 ITPILOT DEVELOPMENT API
    3.1 CONNECTING TO THE SERVER
    3.2 OBTAINING WRAPPERS
    3.3 USING WRAPPERS
    3.4 PROCESSING QUERY RESULTS
        3.4.1 Canceling Queries
    3.5 EXAMPLE OF USE

4 CREATING CUSTOM ITPILOT FUNCTIONS
    4.1 NAMING CONVENTIONS AND ANNOTATIONS
    4.2 COMPOUND TYPES
    4.3 PAGE TYPE
    4.4 CUSTOM FUNCTION RETURN TYPE
    4.5 EXAMPLE

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
    5.1 INTRODUCTION
    5.2 REPRESENTATION FORMAT OF A WRAPPER
        5.2.1 Initialization of Searchable Parameters
        5.2.2 Main Function
        5.2.3 Generating the Output Structure
    5.3 PREDEFINED ITPILOT COMPONENT GUIDE
        5.3.1 Introduction
        5.3.2 Data Structures
        5.3.3 Common functions
        5.3.4 Add Record To List
        5.3.5 Condition
        5.3.6 Create List
        5.3.7 Create Persistent Browser
        5.3.8 Diff
        5.3.9 ExecuteJS
        5.3.10 Expression
        5.3.11 Extractor
        5.3.12 Fetch
        5.3.13 Filter
        5.3.14 Form Iterator
        5.3.15 Get Page
        5.3.16 Init
        5.3.17 Iterator
        5.3.18 JDBCExtractor
        5.3.19 Loop
        5.3.20 Next Interval Iterator
        5.3.21 Output
        5.3.22 Record Constructor
        5.3.23 Record Sequence or Extractor Sequence
        5.3.24 Release Persistent Browser
        5.3.25 Repeat
        5.3.26 Script
        5.3.27 Sequence
        5.3.28 Store File
        5.3.29 Thread
    5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
        5.4.1 Developing Custom Components
        5.4.2 Using Custom Components
    5.5 WRAPPER DEVELOPMENT

REFERENCES

FIGURES

Figure 1. Example of query execution to a wrapper
Figure 2. ITPilot Custom Function Sample
Figure 3. ITPilot Wrapper Skeleton in JavaScript
Figure 4. Using the ExecuteJS NSEQL command
Figure 5. Using threads in the Iterator component
Figure 6. Using the Loop function
Figure 7. Using the Repeat function
Figure 8. Using custom components from JavaScript


PREFACE

SCOPE

Denodo ITPilot enables easy access to and extraction of data from semi-structured Web data sources. This document is an introduction to application development using wrappers created by Denodo ITPilot.

WHO SHOULD USE THIS DOCUMENT

This document is aimed at developers who want to learn how to build applications that make the best use of the advanced automation and Web data extraction functionality provided by Denodo ITPilot. Detailed information on installing and managing the system is provided in other manuals, which are referenced as the need arises.

SUMMARY OF CONTENTS

More specifically, this document:

• Presents the fundamental steps needed to develop an application that uses the wrappers generated by Denodo ITPilot.

• Describes the task of exporting and deploying a wrapper as a Web Service.

• Gives a detailed description of how to use the development API offered by Denodo ITPilot.

• Provides an example of how to develop an application that uses a wrapper installed in a Denodo ITPilot execution server.

• Details how to create custom ITPilot functions.

• Explains how to develop wrappers by using the ITPilot JavaScript components.


1 INTRODUCTION

Denodo ITPilot is a Denodo Technologies solution for extracting and structuring the data present in Web sources. This is done by constructing an abstraction of the target Web source, called a "wrapper", that frees client applications from the difficulties associated with accessing and extracting the required data. ITPilot provides a distributed and scalable environment for generating, executing and maintaining wrappers. See [USER] and [GENER] for more information on how to create, install and maintain wrappers using Denodo ITPilot.

This manual describes the Java development API for creating clients that use wrappers that have already been generated and installed. It gives the basic guidelines for using the API, describes its main components and provides some examples of use. See the Javadoc documentation [JDOC] for more details on classes, attributes and operations.

This manual also explains how to access wrappers through Web Services exported in the execution environment.


2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES

The wrappers saved in the execution server can be invoked in two different ways. First, the native ITPilot Java API can be used from a Java application to access the wrappers, obtain their data structure and run queries on them. This API is described in section 3. The other option is to expose the wrappers through Web Services, which is described in this section.

A Web Service containing the following operations can be generated for a particular wrapper:

• An operation containing all searchable and compulsory parameters.

• Optionally, another operation with all searchable and compulsory parameters plus any searchable and optional parameters selected in the Web Service generation process (this process is defined in [USER]).

The ITPilot execution server generates a Web Service as a war file that can be deployed in any J2EE application server.

2.1 WEB SERVICE TYPES

ITPilot allows a wrapper to be published as a Web Service so that it can be used by any external application. The ITPilot execution server generates a Web Service as a war file that can be deployed in any J2EE application server. The types of Web Services that ITPilot can publish are:

• SOAP [SOAP] Web Services.

• REST-style Web Services that use HTTP directly as the transport protocol and return data encoded in XML.

• HTML Web Services. Similar to the REST-style Web Services, but the output consists of an HTML table containing the response data for the query executed. The table includes JavaScript code to sort the results by any field and/or paginate the returned results. It is also possible to adjust the size of the table and the cells and to modify its graphic appearance using a CSS file.

The following sections describe the querying process for these Web Services.

2.2 INVOKING SOAP WEB SERVICES

The SOAP version of the published Web Services can be accessed by using any Web Service client or client generator that meets the SOAP 1.2 [SOAP] and WSDL 1.1 [WSDL] standards, such as the Apache Axis wsdl2java [AXIS] or .NET Framework wsdl [DOTNET] tools. The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL to the Web Service WSDL:

    http://<domain>:<port>/<service_name>/services/<service_name>?wsdl

The samples/itpilot/itp-clients directory of the ITPilot distribution contains a sample client generated using Apache Axis. The README file in this path contains detailed information on how to generate, compile and run the files that make up the client application.

2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES

This section describes how to invoke the REST and HTML versions of the Web Services that have been published by DataPort, once they have been deployed in the Web Service container. Once the war file has been deployed in the J2EE application server, the relative paths /rest and /html of the webapp each show an information screen for the respective Web Service version, listing the available operations.


Example: if the Web Service container is running on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the access URLs for the information page in the REST (XML output) and HTML versions would be:

    http://acme:9090/testWS/rest
    http://acme:9090/testWS/html

For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the xsd file that describes the schema of the XML document returned by each operation. To access the XML Schema of the data returned by an operation of the REST version of the Web Service, the following URL format should be used:

    http://host:port/serviceName/rest/opName.xsd

Example: again, if the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the following URL obtains the XML Schema of the data returned by the operation getPRODUCTDATA:

    http://acme:9090/testWS/rest/getPRODUCTDATA.xsd

The format used to invoke a specific operation in the REST version is the following:

    http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen

where n is the number of parameters of the operation. The format for the HTML version is the same, but replacing 'rest' with 'html'.

Example: the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS. Let us also suppose that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML Web Service versions, respectively:

    http://acme:9090/testWS/rest/getPRODUCTDATA
    http://acme:9090/testWS/html/getPRODUCTDATA

If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results for a parameter value of 1 would be obtained with:

    http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
    http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
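The invocation format above can be assembled programmatically. The following is only a minimal sketch: the class RestUrlSketch and its buildUrl method are hypothetical names that are not part of ITPilot, and the sketch assumes that parameters are passed as a standard query string (name=value pairs after '?', separated by '&'). Values are encoded with java.net.URLEncoder, which applies form encoding (spaces become '+').

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the ITPilot distribution.
final class RestUrlSketch {

    /**
     * Builds http://host:port/service/version/operation?name1=value1&...
     * where version is "rest" or "html".
     */
    static String buildUrl(String host, int port, String service, String version,
                           String operation, Map<String, String> params) {
        StringBuilder url = new StringBuilder("http://")
                .append(host).append(':').append(port)
                .append('/').append(service)
                .append('/').append(version)
                .append('/').append(operation);
        char separator = '?';                       // first parameter starts the query string
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(separator)
               .append(encode(p.getKey())).append('=').append(encode(p.getValue()));
            separator = '&';                        // later parameters are joined with '&'
        }
        return url.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);     // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("prod_id", "1");
        System.out.println(
                buildUrl("acme", 9090, "testWS", "rest", "getPRODUCTDATABYPRODID", params));
    }
}
```

A LinkedHashMap is used so that the parameters appear in the URL in insertion order; an operation without parameters is built by passing an empty map.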

2.3.1 HTML Output Configuration

The HTML version of the published Web Services may be invoked with certain additional parameters to configure the HTML table used to display the results of the queries. The additional parameters are as follows:

• shownumresults. If this parameter is given with the value true, the table displays information on the number of results obtained by the wrapper.

• intervalsize. If this parameter is given, the results obtained by the wrapper are displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.

• maxresults. The maximum number of results to be displayed. If the wrapper run returns more results than indicated, all excess results are discarded.

• cellwidth. Maximum cell width, expressed in number of characters. The width of each cell in the table adapts to the text, except where the size indicated in this parameter is exceeded. In this case, carriage returns are added to divide the text into lines.

• cellheight. Maximum number of lines in a cell after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of this column are given a scroll bar.

• width. The maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.

• height. The maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.

These parameters must be given in the part of the URL corresponding to the access path (before the query parameters), in the following format:

    http://host:port/serviceName/html/opName/paramName1/value1/.../paramNamen/valuen

For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a maximum pagination interval size of 10. Once again, it is assumed that the Web Service container runs on port 9090 of the acme machine:

    http://acme:9090/testWS/html/getPRODUCTDATA/maxresults/50/intervalsize/10

2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES

Once the Web Service operations have been exported, some parameters can be used to configure the connection pool that the Web Services use to connect to the ITPilot server. The web.xml file in the WEB-INF path of the exported Web Service (either inside the war file generated by ITPilot or in the directory where the Web Service has been deployed) has three parameters for configuring the connection pool:

1. poolEnabled: enables or disables the connection pool. The possible values are "true" and "false".

    <env-entry>
        <env-entry-name>poolEnabled</env-entry-name>
        <env-entry-value>false</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

2. poolInitSize: defines the initial size of the connection pool.

    <env-entry>
        <env-entry-name>poolInitSize</env-entry-name>
        <env-entry-value>0</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this value, new requests are queued until a connection becomes free.

    <env-entry>
        <env-entry-name>poolMaxActive</env-entry-name>
        <env-entry-value>30</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
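When checking a deployed service, it can be handy to read these three settings back from web.xml. The sketch below does so with the standard JAXP DOM parser. The class PoolSettingsSketch and its methods are hypothetical names, the embedded web.xml fragment is a reduced sample that assumes the env-entry layout shown above, and a real web.xml contains many other elements.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the ITPilot distribution.
final class PoolSettingsSketch {

    // Reduced sample of a web.xml holding only the three pool entries.
    static final String SAMPLE =
            "<web-app>"
          + entry("poolEnabled", "false")
          + entry("poolInitSize", "0")
          + entry("poolMaxActive", "30")
          + "</web-app>";

    private static String entry(String name, String value) {
        return "<env-entry>"
             + "<env-entry-name>" + name + "</env-entry-name>"
             + "<env-entry-value>" + value + "</env-entry-value>"
             + "<env-entry-type>java.lang.String</env-entry-type>"
             + "</env-entry>";
    }

    /** Returns every env-entry of the given web.xml content as a name -> value map. */
    static Map<String, String> readEnvEntries(String webXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(webXml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> entries = new LinkedHashMap<String, String>();
            NodeList nodes = doc.getElementsByTagName("env-entry");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                String name = e.getElementsByTagName("env-entry-name")
                        .item(0).getTextContent().trim();
                String value = e.getElementsByTagName("env-entry-value")
                        .item(0).getTextContent().trim();
                entries.put(name, value);
            }
            return entries;
        } catch (Exception ex) {
            throw new IllegalArgumentException("Could not parse web.xml content", ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(readEnvEntries(SAMPLE));
    }
}
```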


3 ITPILOT DEVELOPMENT API

Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Among other functions, this API makes it possible to connect to a Denodo ITPilot execution server, obtain a reference to a wrapper installed in that server and query it. It also supports additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.

The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance can be used to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance may be used to carry out various tasks on the wrapper, the most important of which is query execution.

When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).

The following subsections deal in more detail with each of the stages mentioned: connection to the server, obtaining references to wrappers, executing actions on them and query processing. An exhaustive description of the API at programming level can be found in the Javadoc documentation [JDOC].

3.1 CONNECTING TO THE SERVER

There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.

If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server runs.

If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, it is possible to specify any database created in the Virtual DataPort server when connecting, and to use any user defined in it. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).

In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server runs, this constructor takes the name of the Virtual DataPort database to connect to, the user ID with which access is made and the associated password.

Note that even if Virtual DataPort is installed, it is still possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called 'itpilot' is accessed, using the predefined user 'admin' (with the initial password 'admin').


3.2 OBTAINING WRAPPERS

As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class incorporates methods for obtaining data on the execution server and accessing the wrappers present in it:

• Collection getHTMLWrapperNames(). Obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database and only the wrappers associated with that database are obtained.

• HTMLWrapperProxy getHTMLWrapper(String wpName). Obtains a reference to the wrapper whose name is specified as a parameter.

• Collection getDatabaseNames(). This method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.

• void deleteWrapper(String wpName). Deletes from the server the wrapper whose name is specified as a parameter.

• void loadWrapper(String vql). Takes as its input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.

• String getVQL(). Returns the VQL description of all wrappers in the ITPilot execution server.

3.3 USING WRAPPERS

Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:

    HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified during the creation of the wrapper. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper expects a float-type parameter and we want to assign the value 325 when invoking it, we must pass the string "325". In the case of the float, double and date data types, it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper's Init component or, in the case of date data types, the date pattern if one was set.

For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.

Although most applications will not require this, a wrapper's schema can be obtained using the method:

    HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).

The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, boolean or blob. The value of a compound attribute is always an array of registers. In turn, each register is composed of several fields, and again these fields may be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the atomic fields that make up the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field:

• Its type (method java.lang.Class getType()).

• Whether it is a searchable field that is not part of the output schema, that is, whether its value is not obtained from the source (method boolean isSearchStatus()) and, in that case, whether it is mandatory (method boolean isMandatoryStatus()).

• If they were defined during the generation process, the regular expression (method java.lang.String getRegexp()) and the aliases defined for the field (method java.util.List getTextValues()).

Finally, the methods

    void setMaintenance(boolean value)
    void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

set through the API whether a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the DENODO_HOME/metadata/maintenance-backup path (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper is regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
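Because query parameters are always passed as strings, an application has to format numeric and date values itself before filling the parameter map for query(Map). The following is a minimal sketch using only the standard Java library. The class QueryParamsSketch, the parameter names PRICE and RELEASE_DATE, the German locale and the date pattern are all assumptions made for illustration; the locale and pattern that apply in practice are the ones configured in the wrapper.

```java
import java.text.NumberFormat;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the ITPilot API.
final class QueryParamsSketch {

    /** Formats a decimal value using the decimal separator of the given locale. */
    static String formatDecimal(double value, Locale locale) {
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        format.setGroupingUsed(false);          // no thousands separators
        format.setMaximumFractionDigits(10);    // do not round to the default 3 digits
        return format.format(value);
    }

    /** Formats a date with an explicit pattern (JVM default time zone). */
    static String formatDate(Date date, String pattern, Locale locale) {
        return new SimpleDateFormat(pattern, locale).format(date);
    }

    public static void main(String[] args) {
        Locale wrapperLocale = Locale.GERMANY;  // assumed wrapper configuration
        Date released = new GregorianCalendar(2011, Calendar.AUGUST, 16).getTime();

        // Every value is a String, whatever type the wrapper expects.
        Map<String, String> params = new HashMap<String, String>();
        params.put("PRICE", formatDecimal(3.25, wrapperLocale));
        params.put("RELEASE_DATE", formatDate(released, "dd/MM/yyyy", wrapperLocale));
        System.out.println(params);
    }
}
```

With a German locale the decimal value is written with a comma, whereas Locale.US would produce a point, which is why the value must match the wrapper's internationalization configuration.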

3.4 PROCESSING QUERY RESULTS

The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time through the network).

The method hasNext() checks whether there are still elements to return. Because of this asynchronous behavior, it must be called before accessing each element to make sure that data is available.

The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method

    com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname)

where fieldname is the name of the desired field. The method next() throws an exception of type NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return. Hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX() (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation [JDOC] for details.

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator has two methods for this:

• Boolean checkErrors(). Checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.

• String getErrorDescription(). If an error has occurred, returns a textual description of it. Otherwise, it returns null. The custom error messages specified by the wrapper creator for the raise error handler in the Wrapper Generator Tool (see [GENER]) are accessed through this method.
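The consumption pattern described in this section (call hasNext() before every next(), then check for errors once the iteration ends) can be illustrated without an ITPilot server. The sketch below uses a stand-in class, AsyncResultsSketch, that is not part of the ITPilot API. It delivers rows from a producer thread through a queue, and its hasNext() simply blocks until a row or the end marker arrives. How the real HTMLWrapperResultIterator waits for data is not specified here, so that blocking behavior is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Stand-in for an asynchronous result iterator; NOT the ITPilot class.
final class AsyncResultsSketch implements Iterator<String> {

    private static final String END = new String("<end>"); // sentinel, compared by identity
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<String>();
    private String buffered;        // row fetched by hasNext() and not yet consumed
    private boolean finished;
    private volatile String error;  // null when the query ended without errors

    // Producer side: rows are published as they are extracted from the source.
    void publish(String row) { queue.add(row); }
    void finish(String errorOrNull) { error = errorOrNull; queue.add(END); }

    public boolean hasNext() {
        if (buffered != null) return true;
        if (finished) return false;
        try {
            String item = queue.take();          // waits for the next row or the end marker
            if (item == END) { finished = true; return false; }
            buffered = item;
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            finished = true;
            return false;
        }
    }

    public String next() {
        if (buffered == null) throw new NoSuchElementException("call hasNext() first");
        String row = buffered;
        buffered = null;
        return row;
    }

    public void remove() { throw new UnsupportedOperationException(); }

    boolean checkErrors() { return error != null; }
    String getErrorDescription() { return error; }

    /** Consumer side: the loop an application would run over the results. */
    static List<String> consume(AsyncResultsSketch results) {
        List<String> rows = new ArrayList<String>();
        while (results.hasNext()) {
            rows.add(results.next());
        }
        if (results.checkErrors()) {
            System.err.println("Error: " + results.getErrorDescription());
        }
        return rows;
    }

    public static void main(String[] args) {
        final AsyncResultsSketch results = new AsyncResultsSketch();
        new Thread(new Runnable() {
            public void run() {
                results.publish("row 1");
                results.publish("row 2");
                results.finish(null);
            }
        }).start();
        System.out.println(consume(results));
    }
}
```

Checking for errors only after the loop mirrors the order used in the text: a failure at the source ends the iteration early, and the description explains why fewer rows than expected were received.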


3.4.1 Canceling Queries

The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

    void cancel()

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the 'acme' machine on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in section 3.3:

    TITLE, DIRECTOR, EDITIONS {FORMAT, PRICE, DESCRIPTION}

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then the compound field EDITIONS is obtained; it is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except the field PRICE, which is a double. Finally, any errors produced during execution are checked.


    package com.denodo.itpilot.client;

    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.List;
    import java.util.Map;

    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    // DoubleVO is used below; it is assumed to live in the same package
    // as the other value classes.
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;

    public class ITPilotExample {

        public static void main(String args[]) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server =
                        new HTMLWrapperServerProxy("acme", 9999);

                // Get Wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ". ");

                    // Get and print atomic fields: TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);

                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);

                        RegisterVO editionVO = (RegisterVO) editions.next();
                        Map edition = editionVO.getValues();

                        // Fields can be read from the map of values...
                        SimpleVO formatVO = (SimpleVO) edition.get("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);

                        // ...or directly from the register
                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);

                        SimpleVO descriptionVO =
                                (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\t DESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server: " + e);
            }
        }
    }

Figure 1. Example of query execution to a wrapper


4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions (including custom ITPilot functions), is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. We recommend developing custom functions using Java annotations, although it is also possible to use naming conventions and thus create custom functions without dependencies on Denodo libraries. The annotations, and the compound types and values required to create custom functions, are located in:

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a Jar contains one or more functions with name conflicts, nothing in that Jar will be loaded in the server.
• All custom functions stored in the same Jar are added or removed together by uploading/removing the Jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, this class must define a Java method implementing the function for those arguments, and one additional method in case the signature returns a different type depending on the parameters or the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

Java                   ITPilot
java.lang.Integer      int
java.lang.Long         long
java.lang.Float        float
java.lang.Double       double
java.lang.Boolean      boolean
java.lang.String       text
java.util.Calendar     date
byte[]                 binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be basic (primitive) types: int, long, double, etc.
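As a minimal sketch of the type mapping above, the class below declares a signature whose parameters are java.util.Calendar (date) and java.lang.Integer (int). It assumes the naming-convention approach described in section 4.1, so it compiles without any Denodo library. The function name Add_Days_Sample and its behavior are hypothetical, chosen only for illustration.

```java
import java.util.Calendar;

// Hypothetical custom function Add_Days_Sample (illustration only).
// Declared without the public modifier only so that the snippet
// compiles in any source file.
class Add_Days_SampleItpFunction {

    // java.util.Calendar maps to date and java.lang.Integer maps to int.
    // Both parameters are object types; a primitive "int days" would
    // break the rule stated in the note above.
    public Calendar execute(Calendar date, Integer days) {
        if (date == null || days == null) {
            return null;
        }
        // Work on a copy so that the input value is left untouched.
        Calendar result = (Calendar) date.clone();
        result.add(Calendar.DAY_OF_MONTH, days);
        return result;
    }
}
```

Because the parameters are object types, the method has to decide what to do with null arguments; returning null here is an assumption of the sketch, not documented ITPilot behavior.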

4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.

To make a Java class recognizable as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g., the class Concat_SampleItpFunction can declare a method as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. The definition of this method is optional. If it is not present, the execute method will be called and the return type will be obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the Jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: class annotation used to define the class as a custom function. The annotation requires the parameters:
    o name: name of the custom function.
    o type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor: method annotation used to specify the method as a function signature. This method will be executed when using the function with the appropriate arguments. The annotation has an optional variable, syntax, used to specify the syntax of the function signature when presenting it to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType: method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.

• com.denodo.common.custom.annotations.CustomParam: parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
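The naming-convention rules given earlier in this section can be sketched as follows. This is a minimal illustration, not a reference implementation: the class name follows the <FunctionName> + "ItpFunction" pattern, the method is named execute, and the array parameter gives the function arity n. Skipping null arguments is an assumption made for the example.

```java
// Would be interpreted as a custom function named Concat_Sample.
// Declared without the public modifier only so that the snippet
// compiles in any source file.
class Concat_SampleItpFunction {

    // Arity-n signature: the last (here, the only) parameter is an
    // array, so the function accepts any number of text arguments.
    public String execute(String... inputs) {
        if (inputs == null) {
            return null;
        }
        StringBuilder result = new StringBuilder();
        for (String input : inputs) {
            // Ignoring null arguments is an illustrative choice.
            if (input != null) {
                result.append(input);
            }
        }
        return result.toString();
    }
}
```

No Denodo class is imported, which matches the statement that naming conventions work without Java annotations.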

4.2 COMPOUND TYPES

Compound types and values in the custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types, or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on the input values, and functions returning compound types, can implement an additional method to compute the return type without executing the function. Except in the case described in rule 1 below, this is optional, but it provides better performance when the execution of the function is slower or more memory intensive than the return type calculation. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained directly from the method).
2. The additional method must have the same number of parameters as the execution method.
3. Each parameter of the additional method must have the same or an equivalent type as its respective parameter in the execute method.

The value returned by the additional method depends on what the execute method returns:

• If the execute method returns a basic Java type, the additional method has to return the same basic Java class. I.e., if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table "Equivalency between Java and ITPilot data types" at the beginning of section 4 to know the type that these return values will have in ITPilot.
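The pairing between an execution method and its return-type method can be sketched with the naming conventions of section 4.1, which need no Denodo libraries. The function name Repeat_Sample and its logic are hypothetical. The sketch only illustrates two of the rules above: both methods take the same parameters, and because execute returns a String, the return-type method returns java.lang.String.class. Declaring the return-type method as Class<?> is an assumption of this example.

```java
// Hypothetical custom function Repeat_Sample (illustration only).
// Declared without the public modifier only so that the snippet
// compiles in any source file.
class Repeat_SampleItpFunction {

    // Execution method: repeats a text value the given number of times.
    public String execute(String value, Integer times) {
        if (value == null || times == null) {
            return null;
        }
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < times; i++) {
            result.append(value);
        }
        return result.toString();
    }

    // Return-type method: same number and types of parameters as
    // execute. It does no repetition work at all, which is why it can
    // be cheaper than running the function itself.
    public Class<?> executeReturnType(String value, Integer times) {
        return String.class;
    }
}
```

For a function with a fixed simple return type such as this one, the return-type method is optional; it is shown here only because it is the smallest case that runs without Denodo classes.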

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT_SAMPLE, which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type = CustomElementType.ITPFUNCTION, name = "SPLIT_SAMPLE")
public class Split {

    // Name of the single field of each record in the returned array
    private static final String STRING_FIELD = "string";

    @CustomExecutor()
    public CustomArrayValue split_sample(
            @CustomParam(name = "regexp") String regex,
            @CustomParam(name = "value") String value) {

        if (value == null || regex == null) {
            return null;
        }
        String[] result = value.split(regex);
        List<CustomRecordValue> arrayValues =
                new ArrayList<CustomRecordValue>(result.length);
        for (String string : result) {
            // A fresh map per record, so that records never share state
            LinkedHashMap<String, Object> results =
                    new LinkedHashMap<String, Object>(1);
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue =
                    CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    // Computes the return type (an array of one-field records)
    // without executing the function
    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}

Figure 2. ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.

Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how to interact in a wrapper with the predefined ITPilot components (section 5.3) and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

function getInit() {
    var start = new Init();
    start.setText("INITPARAM", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}

Figure 3. ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist (see footnote 1).

These functions correspond to the definition of the process as components: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

1 Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

This function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on in section 5.3 (the Component Catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure

This function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the catalog in section 5.3.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter lists the predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL).
rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.

5.3.2.1 Record Structure

• Object: Record_Structure

• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).

• Functions:

    o Constructor(name)
        • name: name of the structure.

    o setText(field, regexp, type): creates a new character string field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setLink(field, type): creates a new Link-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setInt(field, type): creates a new Integer-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setBoolean(field, type): creates a new boolean-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setLong(field, type): creates a new Long-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setFloat(field, type): creates a new Float-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setDouble(field, type): creates a new Double-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setDate(field, regexp, format, type): creates a new Date-type field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
        • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setRegister(record, type): creates a new Record-type field in the record.
        • record: record name.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setArray(name, structure, type): creates a new Array-type field in the record.
        • name: name of the array.
        • structure: data structure that represents the record structure contained in the array.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR, as explained in section 5.3.22, must be used, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List

• Functions:

    o setListName(listName): sets the name of the list.
        • listName: name of the list.

    o add(obj): adds an element to the list.
        • obj: element to add.

    o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described once in this first section. The catalog indicates which components do not provide some of these "common" functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId parameter values.

    o errorId: indicates the type of error for which the behavior is to be managed. The possible values are:
        • RUNTIME_ERROR: error while the component is being run.
        • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
        • HTTP_ERROR: error produced by an HTTP error.
        • TIMEOUT_ERROR: error raised if the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowser.MAX_DOWNLOAD_TIME.1 for Internet Explorer, IEBrowser.MAX_DOWNLOAD_TIME.2 for Firefox and IEBrowser.MAX_DOWNLOAD_TIME.3 for the HTTP browser.
        • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).

    o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
        • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
        • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have any kind of return value will return "null" if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it will be evaluated as "false" if they are used in a condition expression.
        • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each parameter.
        • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with the ON_ERROR_RETRY error type, but continues with the wrapper execution if the error still happens after the retries.

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level to be used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:

    o TRACE
    o DEBUG
    o INFO
    o WARN
    o ERROR
    o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List

• Description: adds a record to a list.

• Functions:

    o Constructor()

    o exec(record, list): executes the function.
        • record: record to be added to the list.
        • list: list to which the record is added.

5.3.5 Condition

• Object: Condition

• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.

• Functions:

    o Constructor(expr)
        • expr: this parameter defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
        • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements on which the condition is evaluated.

5.3.6 Create List

• Object: Create_List

• Description: creates an empty list.

• Functions:

    o Constructor(listname): creates an empty list.
        • listname: name of the list of records to be created.

    o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser

• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.

• Functions:

    o Constructor(): creates a persistent browser and returns its handler.

    o exec(): executes the component.

5.3.8 Diff

• Object: Diff

• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.

• Functions:

    o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
        • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
        • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
        • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
        • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
        • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.

    o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
        • baseCode: character string with the source page content.
        • finalCode: character string or page object with the target page content.

    o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
        • baseCode: character string with the source page content.
        • finalCode: character string or page object with the target page content.

    o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag of the added data.
        • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).

    o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag of the added data.
        • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).

    o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag of the deleted data.
        • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

    o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag of the deleted data.
        • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).

    o setNullWhenEquals(nullWhenEquals): if the result page is identical to any of the two input pages, the component will return "null" instead of the page itself.
        • nullWhenEquals: "true" implies that "null" will be returned when both pages are equal; "false" means that the result page will be returned.

    o setIgnoreTagAttributes(simplifyTags): determines whether the component ignores the HTML tag attributes when comparing both pages.
        • simplifyTags: "true" means that the HTML tag attributes will be ignored. With "false" they will not be ignored.

    o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
        • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.

    o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page or not.
        • mergedDeletions: with "true", the deleted content will be shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.

    o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
        • replacement: Perl [PERL] regular expression.

    o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page. Those tokens that match the regular expression will be discarded before starting the comparison.
        • regexp: Perl [PERL] regular expression.

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;

Execute_JavaScript_1 = new SEQUENCE("sequence",
    "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4. Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.

• Functions:

    o Constructor(expression)
        • expression: object that defines the condition expression. This object is expressed as a character string (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
        • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor

• Description: responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).

• Functions:

    o Constructor(name, page, specification, structure)
        • name: name of the Extractor component instance.
        • page: page-type ITPilot structure from which data is to be extracted.
        • specification: DEXTL data extraction specification (see [DEXTL]).
        • structure: name of the record (previously created) that will be used to return the data extracted by the specification.

    o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.

    o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
        • merge: Boolean parameter; "true" if the pattern merging technique is to be applied, "false" if not. It is "true" by default.

    o setI18n(i18n): updates the internationalization used by the process.
        • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch
• Description: obtains the contents of the URL or page given as the input argument and returns them in binary or text format.
• Functions:
    o Constructor(url, sequenceType, reusableConnection, binary, page)
        • url: URL of the resource to be downloaded (OPTIONAL).
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
        • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
        • page: optionally, the page from which the HTTP request is launched.
    o exec(page): runs the component, returning the string- or binary-type value obtained.
        • page: optionally, the page from which the HTTP request is launched.
    o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
        • encoding: MIME type of the information to send.
    o syncWithPost(flag): sets the method for recovering the page state. With this method, ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
        • flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
    o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, against which further information extraction operations are going to be executed.
        • back: back sequence (NSEQL program).
    o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
        • reusingConnection: if set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing the information from the previous session.
    o setBackPages(pages): sets the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor POST synchronization has been configured.
    o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
        • 0: default browser implementation
        • 1: Internet Explorer browser implementation
        • 2: Firefox browser implementation
        • 3: Denodo HTTP browser implementation
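
The order in which syncWithPost, setBackSequence and setBackPages take effect can be summarized in a few lines. The following is a minimal sketch in plain JavaScript, not ITPilot code: chooseBackStrategy and the shape of its config argument are hypothetical names introduced only for illustration, and the default of one page for Back() is an assumption.

```javascript
// Hypothetical helper (not part of the ITPilot API): mirrors the documented
// order in which a component decides how to recover the state of a page.
function chooseBackStrategy(config) {
    // syncWithPost is documented as the default synchronization method.
    var usePost = config.syncWithPost === undefined ? true : config.syncWithPost;
    if (usePost) {
        return { strategy: "POST" };
    }
    // syncWithPost(false): use the explicit back sequence if one was set.
    if (config.backSequence) {
        return { strategy: "BACK_SEQUENCE", nseql: config.backSequence };
    }
    // Otherwise fall back to the NSEQL Back() command. The page count
    // corresponds to setBackPages (defaulting to 1 here is an assumption).
    return { strategy: "BACK_COMMAND", pages: config.backPages || 1 };
}

console.log(chooseBackStrategy({}).strategy);
console.log(chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" }).strategy);
console.log(chooseBackStrategy({ syncWithPost: false, backPages: 2 }).strategy);
```

The same decision order applies to the other components in this chapter that offer these four functions.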

5.3.13 Filter

• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
    o Constructor(expr, auxiliaryRecords)
        • expr: selection expression of the filtering operation, applied to the list of records passed to the exec function.
        • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.
    o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
        • inputRecords: list of input records.
        • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
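
The behavior described in the note can be pictured with a small stand-alone sketch. It is not the ITPilot implementation: filterRecords is a hypothetical helper, the condition is a plain JavaScript function standing in for the selection expression, records are plain objects, and the two constants are defined locally as strings because their real values are not given in this guide.

```javascript
// Local stand-ins for the ITPilot constants (assumed values, illustration only).
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical helper: keeps the records for which `condition` holds.
function filterRecords(inputRecords, condition, errorAction) {
    var result = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (condition(inputRecords[i])) {
                result.push(inputRecords[i]);
            }
        } catch (e) {
            if (errorAction !== ON_ERROR_IGNORE) {
                throw e; // ON_ERROR_RAISE: stop and report the error
            }
            // ON_ERROR_IGNORE: drop only the record that caused the error
        }
    }
    return result;
}

var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];

// Condition that fails for the record whose PRICE is missing.
function expensive(record) {
    if (record.PRICE === null) {
        throw new Error("PRICE is null");
    }
    return record.PRICE > 5;
}

var kept = filterRecords(records, expensive, ON_ERROR_IGNORE);
console.log(kept.length); // prints 2
```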

5.3.14 Form Iterator

• Object: Form_Iterator
• Description: generates an execution loop over a specific form, in which each run uses predetermined values for each of the form fields involved.
• Functions:
    o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
        • findForm: NSEQL program that locates the form on which the iteration is based (see [NSEQL] for further information on NSEQL).
        • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
        • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
        • inputPage: input page from which the selected form is invoked iteratively.
        • parallelIterator: if “true”, the component executes its iterations in parallel.
    o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple-selection field of the target form.
        • field: name of the multiple-selection field.
        • position: position of the field among those with the same name, starting at position 0.
        • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
        • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this purpose:
            • CLICKED_ELEMENT: mark the element.
            • NON_CLICKED_ELEMENT: leave the element unmarked.
            • CLICKED_AND_NON_CLICKED_ELEMENT: generate two combinations, one with the element marked and another with the element unmarked.
    o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple-selection field of the chosen form.
        • field: name of the multiple-selection field.
        • position: position of the field among those with the same name, starting at position 0.
        • valuesArray: list of values that must be selected in the field.
        • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
        • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one that appears in the selection field (equals = true) or merely contained in it (equals = false).
        • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this purpose:
            • CLICKED_ELEMENT: mark the element.
            • NON_CLICKED_ELEMENT: leave the element unmarked.
            • CLICKED_AND_NON_CLICKED_ELEMENT: generate two combinations, one with the element marked and another with the element unmarked.
    o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
        • field: name of the HTML selection field.
        • position: position of the field when there is more than one field with the same name.
        • positions: positions of the elements over which the component must iterate.
    o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
        • field: name of the HTML text field.
        • position: position of the field when there are several fields on the form with the same name.
        • values: list of values that must be selected in the field.
        • positions: list that indicates the position held for each element of values in the event of replicated values.
        • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.
    o click(field, value, state): selects an element and runs a “click” event on it.
        • field: name of the HTML field on which the click is to be made.
        • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When it is run on checkboxes, it indicates the value of the selectable element.
        • state: when this function is run on radio buttons, this parameter is not used. When it is run on checkboxes, it indicates the state of the element:
            • CLICKED_ELEMENT: mark the element.
            • NON_CLICKED_ELEMENT: leave the element unmarked.
            • CLICKED_AND_NON_CLICKED_ELEMENT: generate two combinations, one with the element marked and another with the element unmarked.
    o input(field, position, values): indicates the values entered in an input field.
        • field: name of the HTML input field.
        • position: position of the field when there are several fields on the form with the same name.
        • values: list of values to be entered in the field.
    o textarea(field, position, values): indicates the values entered in a text area.
        • field: name of the HTML text area field.
        • position: position of the field when there are several fields on the form with the same name.
        • values: list of values to be entered in the field.
    o toList(): returns the list of NSEQL sequences used in each iteration.
    o setMaxIterations(count): sets the maximum number of iterations that can be executed.
        • count: maximum number of iterations.
    o setRetries(count): sets the number of retries in the event of failure.
        • count: number of retries.
    o setRetryDelay(mseconds): sets the waiting time between retries.
        • mseconds: waiting time between retries, in milliseconds.
    o setParallelIterator(flag): makes the component launch its iterations in parallel.
        • flag: if “true”, the iterations are executed in parallel.
    o next(inputPage): returns the page resulting from running one iteration of the component.
        • inputPage: optional parameter that indicates a new starting page on which the next iteration of the component is run.
    o hasNext(): determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.
    o close(): closes the iterator.
    o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
        • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
    o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
        • back: NSEQL back program.
    o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
        • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
    o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor POST synchronization has been configured.
    o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
        • 0: default browser implementation
        • 1: Internet Explorer browser implementation
        • 2: Firefox browser implementation
        • 3: Denodo HTTP browser implementation
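
To make the iteration model more concrete, the sketch below imitates the hasNext()/next()/close() cycle and the effect of CLICKED_AND_NON_CLICKED_ELEMENT in plain JavaScript. Everything in it is a stand-in: FormIteratorSketch is a hypothetical class, the constants are given arbitrary local values, next() returns a plain object of field values instead of a page, and combining the fields as a Cartesian product is an assumption, since this guide does not state how values of different fields are combined.

```javascript
// Local stand-ins for the ITPilot constants (assumed values).
var CLICKED_ELEMENT = "CLICKED";
var NON_CLICKED_ELEMENT = "NON_CLICKED";
var CLICKED_AND_NON_CLICKED_ELEMENT = "BOTH";

// Hypothetical, simplified model of a form iterator.
function FormIteratorSketch() {
    this.fields = [];              // [{ name, alternatives: [...] }]
    this.maxIterations = Infinity;
    this.index = 0;
    this.closed = false;
}

FormIteratorSketch.prototype.input = function (field, position, values) {
    this.fields.push({ name: field, alternatives: values });
};

FormIteratorSketch.prototype.click = function (field, value, state) {
    // "Both" expands into two alternatives: marked and unmarked.
    var alternatives = state === CLICKED_AND_NON_CLICKED_ELEMENT
        ? [CLICKED_ELEMENT, NON_CLICKED_ELEMENT]
        : [state];
    this.fields.push({ name: field, alternatives: alternatives });
};

FormIteratorSketch.prototype.setMaxIterations = function (count) {
    this.maxIterations = count;
};

FormIteratorSketch.prototype.total = function () {
    var n = 1;
    for (var i = 0; i < this.fields.length; i++) {
        n *= this.fields[i].alternatives.length;
    }
    return Math.min(n, this.maxIterations);
};

FormIteratorSketch.prototype.hasNext = function () {
    return !this.closed && this.index < this.total();
};

FormIteratorSketch.prototype.next = function () {
    // Decode the iteration index as a mixed-radix number, one digit per field.
    var rest = this.index++;
    var combination = {};
    for (var i = this.fields.length - 1; i >= 0; i--) {
        var alts = this.fields[i].alternatives;
        combination[this.fields[i].name] = alts[rest % alts.length];
        rest = Math.floor(rest / alts.length);
    }
    return combination;
};

FormIteratorSketch.prototype.close = function () {
    this.closed = true;
};

var it = new FormIteratorSketch();
it.input("city", 0, ["Madrid", "Paris"]);
it.click("newsletter", "yes", CLICKED_AND_NON_CLICKED_ELEMENT);

var combinations = [];
while (it.hasNext()) {
    combinations.push(it.next());
}
it.close();
console.log(combinations.length); // prints 4
```

Two input values and one checkbox iterated in both states give four runs of the form. Calling setMaxIterations(3) before the loop would stop after the third run.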

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identifier.
• Functions:
    o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
        • browserUuid: browser id.
    o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can also be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
        • pageType: type of browser used to access the page:
            • SEQUENCE_IEBROWSER = 1
            • SEQUENCE_HTTP_BROWSER = 2
        • lastURL: last URL, from which the page comes.
        • lastURLMethod: access method (GET, POST) of the URL from which the page comes.
        • lastURLPostParameters: POST-method parameters of the URL from which the page comes.
        • cookie: stored cookie information.
        • proxyUser: user name to access the proxy, if required.
        • proxyPassword: user password to access the proxy, if required.
        • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
    o Constructor(input, output)
        • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
        • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process; if it is not specified, the record is generated at run time (by the exec() function).
    o get(name): returns the value of a field of the record created as the set of initialization parameters.
        • name: name of the record field.
    o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
        • field: name of the field to create.
        • format: representation format of the date field. This parameter is optional, but once it is specified, compliance with it is compulsory; otherwise the wrapper may fail to run. The representation format is defined in [DATEFORMAT].
        • obl: indicates whether a value for the field is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setName(name): updates the component name.
        • name: new component name.
    o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
        • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
    o exec(): main function for running the component; it returns a record that represents the wrapper initialization parameters.
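
The three values of the obl parameter can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: InitSketch is a hypothetical class, the constants are defined locally as strings, only setText and setInt are modeled, and exec receives the query values as an explicit argument, whereas the real exec() takes no arguments and reads them from the wrapper context. Returning null for an optional field with no value is also an assumption.

```javascript
// Local stand-ins for the ITPilot constants (assumed values).
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical, simplified model of the initialization record.
function InitSketch() {
    this.fields = [];
}

InitSketch.prototype.setText = function (field, obl, fixedValue) {
    this.fields.push({ name: field, type: "text", obl: obl || OPTIONAL, fixedValue: fixedValue });
};

InitSketch.prototype.setInt = function (field, obl, fixedValue) {
    this.fields.push({ name: field, type: "int", obl: obl || OPTIONAL, fixedValue: fixedValue });
};

// Builds the record of query parameters from the values of one query.
InitSketch.prototype.exec = function (queryValues) {
    var record = {};
    for (var i = 0; i < this.fields.length; i++) {
        var f = this.fields[i];
        var supplied = Object.prototype.hasOwnProperty.call(queryValues, f.name);
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue;            // constant, ignores the query
        } else if (supplied) {
            record[f.name] = queryValues[f.name];
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null;                    // optional and not supplied
        }
    }
    return record;
};

var init = new InitSketch();
init.setText("KEYWORD", OBLIGATORY);
init.setInt("MAX_RESULTS");                 // OPTIONAL by default
init.setText("LANGUAGE", FIXED, "en");

var params = init.exec({ KEYWORD: "denodo" });
console.log(params.KEYWORD, params.MAX_RESULTS, params.LANGUAGE);
```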

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
    o Constructor(list)
        • list: list of records on which to iterate.
    o hasNext(): determines whether there are more results on which to iterate. It returns “true” if there is at least one more result.
    o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
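
For comparison, a sequential iteration uses the same hasNext()/next() pair without the Thread object. The following sketch runs outside ITPilot, so IteratorSketch is a hypothetical stand-in for the Iterator component and the records are plain objects with an invented TITLE field.

```javascript
// Hypothetical stand-in for the Iterator component.
function IteratorSketch(list) {
    this.list = list;
    this.position = 0;
}

IteratorSketch.prototype.hasNext = function () {
    return this.position < this.list.length;
};

IteratorSketch.prototype.next = function () {
    return this.list[this.position++];
};

var iterator = new IteratorSketch([{ TITLE: "A" }, { TITLE: "B" }]);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);   // process each record in list order
}
console.log(titles.join(",")); // prints "A,B"
```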

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a list of records with the results obtained.
• Functions:
    o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
        • uuid: unique identifier of the component.
        • uri: connection URL of the database.
        • driver: driver class used to connect to the data source.
        • userName: user name.
        • password: user password.
        • structure: structure of the component’s output record list. It is defined as a record of values.
        • baseRecords: record list to be used.
        • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.
        • initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.
        • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
        • query: SQL query that returns the results required by the component.
    o exec(query, baseRecords): executes the JDBCExtractor component.
        • query: SQL query that returns the results required by the component.
        • baseRecords: record list to be used.
    o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
        • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.
        • initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.
        • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    o disablePool(): disables the connection pool.
    o addDriverProperty(propname, propvalue): adds a JDBC driver property.
        • propname: property name.
        • propvalue: property value.
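
The roles of maxPoolSize, initialPoolSize and the check query can be shown with a generic connection pool. This guide does not describe how ITPilot implements its pool, so the sketch below is only an illustration of what these three settings conventionally mean: PoolSketch, fakeConnect and the connection objects are all hypothetical, and "SELECT 1" is just a placeholder check query.

```javascript
// Hypothetical pool, for illustration only; not the ITPilot implementation.
function PoolSketch(maxPoolSize, initialPoolSize, checkQuery, connect) {
    this.maxPoolSize = maxPoolSize;
    this.checkQuery = checkQuery;
    this.connect = connect;          // factory returning a new connection
    this.idle = [];
    this.inUse = 0;
    // Open the initial idle connections up front (never more than the maximum).
    var initial = Math.min(initialPoolSize, maxPoolSize);
    for (var i = 0; i < initial; i++) {
        this.idle.push(connect());
    }
}

PoolSketch.prototype.acquire = function () {
    if (this.inUse >= this.maxPoolSize) {
        throw new Error("Pool exhausted");
    }
    // Reuse an idle connection only if it still answers the check query.
    while (this.idle.length > 0) {
        var candidate = this.idle.pop();
        if (candidate.run(this.checkQuery)) {
            this.inUse++;
            return candidate;
        }
        // Stale connection: discard it and try the next one.
    }
    this.inUse++;
    return this.connect();
};

PoolSketch.prototype.release = function (connection) {
    this.inUse--;
    this.idle.push(connection);
};

// Fake connections that count how many were opened.
var opened = 0;
function fakeConnect() {
    opened++;
    return { alive: true, run: function (sql) { return this.alive; } };
}

var pool = new PoolSketch(2, 1, "SELECT 1", fakeConnect);
var first = pool.acquire();   // reuses the connection opened at start-up
var second = pool.acquire();  // opens a second connection
console.log(opened);          // prints 2
```

A third acquire() at this point would fail, because both connections allowed by the maximum are in use.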

5.3.19 Loop

• Description: allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript with a while loop, using a Condition object as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: iterates over a series of inter-related pages, using one or several browsing sequences.
• Functions:
    o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
        • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
        • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session’s information.
        • inputPage: page from which the next browsing sequence is to be run.
    o next(inputRecords, inputPage): returns the next iteration element.
        • inputRecords: list of input records that can be used as parameters within the browsing sequences for the next interval.
        • inputPage: page from which the next pages are to be accessed.
    o close(): closes the iterator.
    o setRetries(count): sets the number of retries in the event of an error when accessing the next page.
        • count: number of retries.
    o setRetryDelay(count): sets the interval between two retries.
        • count: interval in milliseconds.
    o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
        • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
    o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
        • back: NSEQL back program.
    o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
        • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
    o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor POST synchronization has been configured.
    o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
        • 0: default browser implementation
        • 1: Internet Explorer browser implementation
        • 2: Firefox browser implementation
        • 3: HTTP browser implementation
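
The effect of setRetries and setRetryDelay can be pictured as a retry loop around the page access. The sketch below is plain JavaScript and not ITPilot code: withRetries is a hypothetical helper, the waiting function is injected so that the example does not really pause, and treating the retry count as the number of attempts after the first one is an assumption.

```javascript
// Hypothetical helper: runs `operation`, retrying after failures.
// `retries` is the number of extra attempts after the first one (assumption).
function withRetries(operation, retries, retryDelayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return operation(attempt);
        } catch (e) {
            if (attempt >= retries) {
                throw e;               // retries exhausted: report the error
            }
            attempt++;
            sleep(retryDelayMs);       // wait before the next attempt
        }
    }
}

// Simulated page access that fails twice before succeeding.
function accessNextPage(attempt) {
    if (attempt < 2) {
        throw new Error("timeout on attempt " + attempt);
    }
    return "page";
}

// Record the requested delays instead of actually waiting.
var delays = [];
function fakeSleep(ms) {
    delays.push(ms);
}

var result = withRetries(accessNextPage, 3, 500, fakeSleep);
console.log(result, delays.length); // prints "page 2"
```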

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
    o Constructor(structure)
        • structure: indicates the component input record to be used as the wrapper result.
    o add(record): adds the given component input record to the wrapper result.
        • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
    o Constructor(recordsObj, name)
        • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
        • name: name of the output record of the Record Constructor component.
    o add(fieldName, expression, errorAction): adds a new field to the record under construction.
        • fieldName: name of the field.
        • expression: expression that defines the field; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
        • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
            • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
            • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
    o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
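
How a field expression refers to an input record can be sketched in a few lines. The example handles only plain references of the form $index.FIELD, which is assumed here to be the shape of the reference quoted in the description of the expression parameter; the real expression language is richer (see Appendix A of [GENER]). The names resolveReference and buildRecord are hypothetical, the constants are local stand-ins, and storing null for a field whose error is ignored is an assumption.

```javascript
// Local stand-ins for the ITPilot constants (assumed values).
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical helper: resolves "$<index>.<FIELD>" against the input records.
function resolveReference(expression, records) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = records[Number(match[1])];
    if (!record || !(match[2] in record)) {
        throw new Error("Cannot evaluate " + expression);
    }
    return record[match[2]];
}

// Hypothetical helper: builds one output record from field definitions.
function buildRecord(records, definitions) {
    var output = {};
    definitions.forEach(function (d) {
        try {
            output[d.fieldName] = resolveReference(d.expression, records);
        } catch (e) {
            if (d.errorAction !== ON_ERROR_IGNORE) {
                throw e;                   // ON_ERROR_RAISE: stop the run
            }
            output[d.fieldName] = null;    // assumption: ignored field stays empty
        }
    });
    return output;
}

var inputs = [{ PARAM1: "books" }, { PRICE: 12 }];
var built = buildRecord(inputs, [
    { fieldName: "CATEGORY", expression: "$0.PARAM1", errorAction: ON_ERROR_RAISE },
    { fieldName: "PRICE", expression: "$1.PRICE", errorAction: ON_ERROR_RAISE },
    { fieldName: "STOCK", expression: "$1.STOCK", errorAction: ON_ERROR_IGNORE }
]);
console.log(built.CATEGORY, built.PRICE, built.STOCK);
```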

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: builds a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from the pages processed by the Extractor component.
• Functions:
    o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
        • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
        • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session’s information. In general this value will be “true”, although it may not be a good option if the preceding iterator runs in parallel.
        • inputPage: optional; indicates a starting page.
    o exec(): returns a page object that represents the target page of the browsing sequences.
    o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
    o Constructor(page)
        • page: page loaded in the browser that is going to be released.
    o Constructor(browserUuid)
        • browserUuid: browser identifier.
    o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript with a do… while loop, using a Condition object as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
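
The practical difference between the while structure of the Loop component (section 5.3.19) and the do… while structure of Repeat is the moment at which the condition is first evaluated. The sketch below shows this in plain JavaScript; ConditionSketch is a hypothetical stand-in for the Condition object that simply wraps a function.

```javascript
// Hypothetical stand-in for the Condition object.
function ConditionSketch(test) {
    this.test = test;
}

ConditionSketch.prototype.exec = function (records) {
    return this.test(records);
};

// A condition that is never met.
var never = new ConditionSketch(function () { return false; });

// Loop-style structure: the condition is checked before the first pass.
var loopRuns = 0;
while (never.exec([])) {
    loopRuns++;
}

// Repeat-style structure: the body runs once before the first check.
var repeatRuns = 0;
do {
    repeatRuns++;
} while (never.exec([]));

console.log(loopRuns, repeatRuns); // prints "0 1"
```

With a condition that is never met, the while body does not run at all, whereas the do… while body runs exactly once.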

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written directly in JavaScript. This component has no specific JavaScript object associated with it. When it is used from the graphic generation interface, it becomes a JavaScript function that is invoked from the position the component holds within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: This creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

o Constructor(sequence, sequenceType, reusableConnection, inputPage)

• sequence: NSEQL browsing program (see [NSEQL]).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

• inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.

o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.

• inputValues: list of values that can be used as input parameters within the browsing sequence.

• inputPage: optional parameter that describes the page from which the component's browsing sequence is run.

o setRetries(count): sets the number of retries in the event of failures.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.

o close(): closes the connection with the running browser.

o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.

• flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.

o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

• reusingConnection: if "true", the connection from previous components will be reused. If "false", a new browser is opened and the data from the previous session is imported.

o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

• pages: number of back pages.

o toString(): returns the NSEQL (see [NSEQL]) sequence.

o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation

• 1: Internet Explorer browser implementation

• 2: Firefox browser implementation

• 3: Denodo HTTP browser implementation
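The way syncWithPost, setBackSequence and setBackPages interact, as described in this section, can be summarized as a small decision function. This is a plain JavaScript restatement of the documented precedence rules, not ITPilot code: chooseBackStrategy, the shape of its settings argument and the returned objects are hypothetical names introduced only for illustration.

```javascript
// Hypothetical helper that restates the documented precedence rules.
// settings.syncWithPost : value passed to syncWithPost(flag)
// settings.backSequence : NSEQL program passed to setBackSequence(back), or null
// settings.backPages    : value passed to setBackPages(pages)
function chooseBackStrategy(settings) {
  if (settings.syncWithPost) {
    // The POST request with the original parameters is reissued.
    return { kind: "post" };
  }
  if (settings.backSequence) {
    // An explicit back sequence takes precedence over Back().
    return { kind: "backSequence", nseql: settings.backSequence };
  }
  // Neither option applies: the NSEQL Back() command is run for backPages pages.
  return { kind: "back", pages: settings.backPages };
}
```

For instance, chooseBackStrategy({ syncWithPost: false, backSequence: null, backPages: 2 }) yields the Back() strategy with two pages, which is the only case in which the setBackPages value matters.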

5.3.28 Store File

• Object: StoreFile

• Description: stores the contents received as the input parameter in a file.

• Functions:

o Constructor(content, file)

• content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case the page content will be stored.

• file: path and name of the file where the contents are to be stored.

o exec(): runs the component.

o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.

• generate: indicates whether the file name should be generated automatically.

o setRetries(count): sets the number of retries in the event of failures.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.
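The retry settings offered by the Sequence and Store File components can be pictured with the following sketch. It does not show how ITPilot implements them: runWithRetries, action and sleep are hypothetical names, and the sketch assumes that count is the number of additional attempts made after the first failure, which the guide does not state.

```javascript
// Hypothetical illustration of setRetries(count) and setRetryDelay(mseconds).
// action : function that throws on failure; receives the attempt index
// sleep  : function that waits the given number of milliseconds (injected
//          so the sketch does not depend on a particular runtime)
function runWithRetries(action, count, mseconds, sleep) {
  var attempt = 0;
  while (true) {
    try {
      return action(attempt);
    } catch (error) {
      if (attempt >= count) {
        throw error; // all retries used up: report the last failure
      }
      attempt++;
      sleep(mseconds); // wait before the next attempt
    }
  }
}
```

Under that assumption, a count of 3 allows up to four executions of the action, with the delay applied before each retry.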

5.3.29 Thread

• Object: Thread

• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

o wait(): causes the thread to wait until all executions invoked with the execute function have finished.

o execute(functionName, <list of arguments>): launches a thread that runs the given function.

• functionName: name of the JavaScript function to be run.

• <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests will be queued until the ongoing executions finish.

• int: maximum number of concurrent threads.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

o This is the main function, where "mycustom" is the name of the custom component.

• mycustom_getInputStructure() { … }

o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

o This function defines the component output type. The possible values are:

• LIST_TYPE = 1

• PAGE_TYPE = 2

• RECORD_TYPE = 3

• SIMPLE_TYPE = 4

• ARRAY_TYPE = 5

• BINARY_TYPE = 6

• BOOLEAN_TYPE = 7

• DATE_TYPE = 8

• DOUBLE_TYPE = 9

• FLOAT_TYPE = 10

• INT_TYPE = 11

• LONG_TYPE = 12

• STRING_TYPE = 13

• URL_TYPE = 14

• BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
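As an illustration of the functions listed above, a minimal component file might look as follows. This is a sketch under assumptions, not a template taken from the guide: the component name shout, the idea that its input arrives as a plain string, and the local STRING_TYPE declaration (added only so that the file runs on its own) are all hypothetical. The input-structure function is left without a body because the guide elides its contents.

```javascript
// shout.js: hypothetical custom component file.
// Value listed in the guide for STRING_TYPE; declared here only so the
// sketch can run outside ITPilot.
var STRING_TYPE = 13;

// Main function: receives the component input and returns its output.
function shout_main(shout_input) {
  var shout_output = null;
  // Assumption of this sketch: the input is a simple string value.
  shout_output = String(shout_input).toUpperCase();
  return shout_output;
}

// Input schema definition; its contents are elided in the guide and are
// therefore not filled in here.
function shout_getInputStructure() {
}

// Declares the output type. Because it is not RECORD_TYPE or LIST_TYPE,
// no shout_getOutputStructure function is needed.
function shout_getOutputType() {
  return STRING_TYPE;
}
```

The function prefix matches the intended file name, following the rule that the name of the .js file matches the name of the custom component.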

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> represents the name of the component.

• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is simple: the VQL statement is written as follows:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the '\' character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl



FIGURES

Figure 1 Example of query execution to a wrapper
Figure 2 ITPilot Custom Function Sample
Figure 3 ITPilot Wrapper Skeleton in JavaScript
Figure 4 Using the ExecuteJS NSEQL command
Figure 5 Using threads in the Iterator component
Figure 6 Using the Loop function
Figure 7 Using the Repeat function
Figure 8 Using custom components from JavaScript

PREFACE

SCOPE

Denodo ITPilot enables easy access to, and extraction of data from, semi-structured Web data sources. This document is an introduction to application development using wrappers created by Denodo ITPilot.

WHO SHOULD USE THIS DOCUMENT

This document is aimed at developers who want to understand how to develop applications that make the best use of the advanced automation and Web data extraction functionalities provided by Denodo ITPilot. The detailed information required to install and manage the system is provided in other manuals, to which reference will be made as the need arises.

SUMMARY OF CONTENTS

More specifically, this document:

• Presents the fundamental steps needed to develop an application that uses the wrappers generated by Denodo ITPilot.

• Describes the task of exporting and deploying a wrapper as a Web Service.

• Gives a detailed description of how to use the development API offered by Denodo ITPilot.

• Provides an example of how to develop an application that uses a wrapper installed in a Denodo ITPilot execution server.

• Details how to create custom ITPilot functions.

• Explains how to develop wrappers by using the ITPilot JavaScript components.

1 INTRODUCTION

Denodo ITPilot is a Denodo Technologies solution for extracting and structuring the data present in Web sources. This process is carried out by constructing an abstraction of the target Web source, called a "wrapper", that frees client applications from the difficulties associated with accessing and extracting the required data. ITPilot provides a distributed and scalable environment for generating, executing and maintaining wrappers. See [USER] and [GENER] for more information on how to create, install and maintain wrappers using Denodo ITPilot.

This manual describes the Java development API for creating clients that use wrappers that have already been generated and installed. It gives the basic guidelines for using the API, describes the main components and provides some examples of use. See the Javadoc documentation [JDOC] for more details on classes, attributes and operations.

This manual also explains how to access wrappers through Web Services exported in the execution environment.

2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES

The wrappers saved in the execution server can be invoked in two different ways. Firstly, the native ITPilot Java API can be used to access the wrappers, obtain their data structure and run queries on them from a Java application; this API is described in section 3. The other option, described in this section, is to expose these wrappers through Web Services. A Web Service containing the following operations can be generated for a particular wrapper:

• An operation containing all searchable and compulsory parameters.

• Optionally, another operation with all searchable and compulsory parameters, plus any searchable and optional parameters selected in the Web Service generation process (this process is defined in [USER]).

The ITPilot execution server generates a Web Service as a .war file that can be deployed in any J2EE application server.

2.1 WEB SERVICE TYPES

ITPilot allows a wrapper to be published as a Web Service so that any external application can use it. The ITPilot execution server generates a Web Service as a .war file that can be deployed in any J2EE application server. The types of Web Services that ITPilot can publish are:

• SOAP [SOAP] Web Services.

• REST-style Web Services, which use HTTP directly as the transport protocol and return data encoded in XML.

• HTML Web Services. These are similar to the REST-style Web Services, but the output consists of an HTML table containing the response data for the query executed. The table includes JavaScript code to sort the results by any field and/or paginate the returned results. It is also possible to adjust the size of the table and the cells and to modify its graphic appearance using a CSS file.

The following sections describe the querying process for these Web Services.

2.2 INVOKING SOAP WEB SERVICES

The SOAP version of the published Web Services can be accessed using any Web Service client or client generator that conforms to the SOAP 1.2 [SOAP] and WSDL 1.1 [WSDL] standards, such as the Apache Axis wsdl2java [AXIS] or .NET Framework wsdl [DOTNET] tools. The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL of the Web Service WSDL:

http://<domain>:<port>/<service_name>/services/<service_name>?wsdl

The ITPilot distribution contains, in the samples/itpilot/itp-clients directory, a sample client generated using Apache Axis. The README file in this path contains detailed information on how to generate, compile and run the files that make up the client application.

2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES

This section describes how to invoke the REST and HTML versions of the Web Services that have been published by DataPort, once they have been deployed in the Web Service container. Once the .war file has been deployed in the J2EE application server, the relative paths /rest and /html of the webapp show an information screen for the respective Web Service version, which lists the available operations.

Example: if the Web Service container is running on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the access URLs for the information page in the REST (XML output) and HTML versions would be:

http://acme:9090/testWS/rest
http://acme:9090/testWS/html

For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the xsd file that describes the schema of the XML document returned by a call to each operation. To access the XML Schema of the data returned by invoking an operation of the REST version of the Web Service, the following URL format should be used:

httphostportserviceNamerestopNamexsd

Example: again, if the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the following URL obtains the XML Schema of the data returned by the operation getPRODUCTDATA:

httpacme9090testWSrestgetPRODUCTDATAxsd

The format used to invoke a specific operation in the REST version is the following:

http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen

where n is the number of parameters of the operation. The format for the HTML version is the same, but replacing 'rest' with 'html'.

Example: the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS. Let us also suppose that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML Web Service versions, respectively:

http://acme:9090/testWS/rest/getPRODUCTDATA
http://acme:9090/testWS/html/getPRODUCTDATA

If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results when this parameter has a value equal to 1 would be obtained by writing:

http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
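The invocation format described in this section can also be assembled programmatically. The following sketch is an illustration under assumptions and is not part of ITPilot: buildOperationUrl is a hypothetical helper, and percent-encoding names and values with encodeURIComponent is a choice made here that the guide does not mention.

```javascript
// Hypothetical helper that builds the URL of a REST or HTML operation.
// version is "rest" or "html"; params maps parameter names to values.
function buildOperationUrl(host, port, serviceName, version, opName, params) {
  var url = "http://" + host + ":" + port + "/" + serviceName + "/" +
            version + "/" + opName;
  var pairs = [];
  for (var name in params) {
    if (Object.prototype.hasOwnProperty.call(params, name)) {
      // Assumption: values are sent as text, percent-encoded for the query string.
      pairs.push(encodeURIComponent(name) + "=" +
                 encodeURIComponent(String(params[name])));
    }
  }
  // Operations without parameters are invoked without a query string.
  return pairs.length > 0 ? url + "?" + pairs.join("&") : url;
}
```

With the values of the examples above, buildOperationUrl("acme", 9090, "testWS", "rest", "getPRODUCTDATABYPRODID", { prod_id: 1 }) produces the REST URL shown for prod_id=1, and passing "html" as the version produces its HTML counterpart.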

2.3.1 HTML Output Configuration

The HTML version of the published Web Services may be invoked with certain additional parameters to configure the HTML table used to display the results of the queries. The additional parameters are as follows:

• shownumresults: if this parameter is given the value true, the table will display information on the number of results obtained by the wrapper.

• intervalsize: if this parameter is given, the results obtained by the wrapper will be displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.

• maxresults: maximum number of results to be displayed. If the wrapper run returns more results than indicated, all excess results will be discarded.

• cellwidth: maximum cell width, expressed in number of characters. The width of each cell in the table will adapt to the text, except where the size indicated in this parameter is exceeded. In that case, line breaks will be added to divide the text into lines.

• cellheight: maximum number of lines in a cell after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of this column are given a scroll bar.

• width: maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.

• height: maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.

These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:

httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen

For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a pagination interval size of 10. Once again, it is assumed that the Web Service container runs on port 9090 of the acme machine:

httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10

2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES

When the Web Service operations have been exported, there are some parameters that can be used to configure the connection pool used by the Web Services to connect to the ITPilot server. The web.xml file found in the WEB-INF path of the exported Web Service (either inside the .war file generated by ITPilot or in the directory where the Web Service has been deployed) has three parameters used to configure the connection pool:

1. poolEnabled: this parameter is used to enable or disable the connection pool. The possible values are "true" or "false".

<env-entry>
    <env-entry-name>poolEnabled</env-entry-name>
    <env-entry-value>false</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

2. poolInitSize: defines the initial size of the connection pool.

<env-entry>
    <env-entry-name>poolInitSize</env-entry-name>
    <env-entry-value>0</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this value, new requests will be queued until a connection becomes free.

<env-entry>
    <env-entry-name>poolMaxActive</env-entry-name>
    <env-entry-value>30</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

3 ITPILOT DEVELOPMENT API

Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Amongst other functions, this API facilitates connecting to a Denodo ITPilot execution server, obtaining a reference to a wrapper installed in that server, and querying it. It also allows a series of additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.

The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Amongst other tasks, this instance makes it possible to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance may be used to carry out various tasks on the wrapper, the most important of which is query execution. When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).

The following subsections deal in more detail with each of the stages mentioned: connection to the server, obtaining references to wrappers, executing actions on them, and query processing. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].

3.1 CONNECTING TO THE SERVER

There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.

If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server runs.

If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort will be used as the execution server for ITPilot. In this case, it is possible to specify in the connection any database created in the Virtual DataPort server and to use any user defined in it. The actions allowed for the user will be consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).

In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server runs, this constructor takes the name of the Virtual DataPort database to connect to, the user ID with which access is to be made, and the associated password.

Even if Virtual DataPort is installed, it is equally possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called 'itpilot' will be accessed, and the predefined user 'admin' (with the initial password 'admin') will be used to gain access.

3.2 OBTAINING WRAPPERS

As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class incorporates methods for obtaining data on the execution server and accessing the wrappers present in it:

• Collection getHTMLWrapperNames(): obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only the wrappers associated with that database will be obtained.

• HTMLWrapperProxy getHTMLWrapper(String wpName): obtains a reference to the wrapper whose name is specified as the parameter.

• Collection getDatabaseNames(): this method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.

• void deleteWrapper(String wpName): deletes from the server the wrapper whose name is specified as the parameter.

• void loadWrapper(String vql): takes as its input argument the VQL that defines a collection of wrappers, which are loaded into the execution server.

• String getVQL(): returns the VQL description of all wrappers in the ITPilot execution server.

33 USING WRAPPERS

Once a reference to a wrapper has been obtained (instance of the class HTMLWrapperProxy) various operations can be carried out on it through the methods of said class To execute a query to a wrapper we will use the method

HTMLWrapperResultIterator query(Map params) The query to be executed is represented as a map of pairs name of attributevalue The attribute names must match the names of the input parameters specified during the creation of the wrapper The values must be specified as character strings even when the input parameters expected by the wrapper belong to other type For example if a wrapper is expecting a float-type parameter and we want to assign the value 325 when invoking it we must pass the ldquo325rdquo string In the case of float double and date data types it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or in case of date data types the date pattern if it was set It is important to take into account that for the query to execute correctly a value must be specified for all the mandatory attributes See [GENER] for more information on the process of generating wrappers in ITPilot Although most of the applications will not require this a wrapper schema can be obtained using the method

HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).

The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and again these fields may be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director’s cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a “hierarchical” register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the various atomic fields that comprise the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field: its type, using the method java.lang.Class getType(); whether the value is obtained from the source or not (that is, whether it is a searchable field that cannot be found in the output schema), using the method boolean isSearchStatus(); and, in that case, whether it is mandatory or not (method boolean isMandatoryStatus()). Furthermore, if they were defined during the generation process, it is also possible to obtain the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).

Finally, the methods

void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting, via the API, whether or not a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the DENODO_HOME/metadata/maintenance-backup path (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
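Returning to the query(Map params) method described earlier in this section, the following is a minimal sketch of how an application might build the string-valued parameter map so that numbers and dates follow a given internationalization configuration. It uses only standard Java classes. The parameter names PRICE and RELEASED, the German locale and the date pattern are assumptions chosen for illustration; the real names and formats depend on how the wrapper and its Init component were configured.

```java
import java.text.NumberFormat;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class QueryParamsSketch {

    // Builds the name/value map for a query; every value is passed as a String.
    // PRICE and RELEASED are hypothetical input parameter names.
    static Map<String, String> buildParams(double price, Calendar released,
                                           Locale wrapperLocale, String datePattern) {
        // Format the number with the decimal separator of the wrapper's locale.
        NumberFormat numbers = NumberFormat.getNumberInstance(wrapperLocale);
        numbers.setGroupingUsed(false);
        numbers.setMinimumFractionDigits(2);
        numbers.setMaximumFractionDigits(2);

        // Format the date with the pattern assumed to be set in the wrapper.
        SimpleDateFormat dates = new SimpleDateFormat(datePattern, wrapperLocale);

        Map<String, String> params = new HashMap<String, String>();
        params.put("PRICE", numbers.format(price));
        params.put("RELEASED", dates.format(released.getTime()));
        return params;
    }

    public static void main(String[] args) {
        Calendar released = new GregorianCalendar(2011, Calendar.AUGUST, 16);
        Map<String, String> params =
                buildParams(12.5, released, Locale.GERMANY, "dd/MM/yyyy");
        // The resulting map is what would be handed to wrapper.query(params).
        System.out.println(params);
    }
}
```

With a German locale the float is rendered with a comma as decimal separator, whereas a US locale would render it with a point; passing the wrong form is the kind of mismatch the paragraph on internationalization warns about.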

3.4 PROCESSING QUERY RESULTS

The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time through the network).

The method hasNext() checks whether there are still elements to return. Because of the asynchronous behavior described above, it must be called before accessing each element to make sure that data is available.

The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field. The method next() throws a NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX() (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:

• Boolean checkErrors(): Checks whether an error has occurred during query execution. Returns ‘true’ if an error has occurred and ‘false’ if not.

• String getErrorDescription(): If an error has occurred, returns a textual description of it; otherwise it returns null. The custom error messages specified by the wrapper creator for the ‘raise error’ handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
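The access pattern described in this section (test hasNext() before every next(), then check for errors once the results are exhausted) can be sketched without the Denodo libraries as follows. The ResultIterator interface and the fake() factory are hypothetical stand-ins that expose only the method names mentioned above; they do not reproduce the asynchronous behavior of the real HTMLWrapperResultIterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ResultLoopSketch {

    // Hypothetical stand-in exposing only the methods named in this section.
    interface ResultIterator extends Iterator<String> {
        Boolean checkErrors();
        String getErrorDescription();
    }

    // Fake iterator: returns the given rows, then reports the given error (may be null).
    static ResultIterator fake(final List<String> rows, final String error) {
        final Iterator<String> it = rows.iterator();
        return new ResultIterator() {
            public boolean hasNext() { return it.hasNext(); }
            public String next() { return it.next(); }
            public void remove() { throw new UnsupportedOperationException(); }
            public Boolean checkErrors() { return !it.hasNext() && error != null; }
            public String getErrorDescription() { return checkErrors() ? error : null; }
        };
    }

    // Drains the iterator, then checks for errors once no more results are pending.
    static List<String> drain(ResultIterator results) {
        List<String> out = new ArrayList<String>();
        while (results.hasNext()) {   // always test before calling next()
            out.add(results.next());
        }
        if (results.checkErrors()) {
            out.add("ERROR: " + results.getErrorDescription());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(fake(Arrays.asList("row1", "row2"), "connection lost")));
    }
}
```

Checking for errors after the loop means a query that fails part-way through still delivers the rows already received, followed by the error description.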

3.4.1 Canceling Queries

The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

void cancel()

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine ‘acme’, on port 9999. Next, it obtains a reference to the wrapper called “Movies”, whose schema is the one used as an example in section 3.3:

TITLE, DIRECTOR, EDITIONS (FORMAT, PRICE, DESCRIPTION)

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value “Woody Allen”, and the results are processed and written to the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then the compound field EDITIONS is obtained; it is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except PRICE, which is a double. Finally, any errors produced during execution are checked.

    package com.denodo.itpilot.client;

    import java.util.List;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Iterator;

    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    // DoubleVO is assumed to live in the same package as SimpleVO
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

    public class ITPilotExample {

        public static void main(String args[]) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server =
                        new HTMLWrapperServerProxy("acme", 9999);

                // Get Wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ". ");

                    // Get and print atomic fields TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);
                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);
                        RegisterVO editionVO = (RegisterVO) editions.next();
                        Map edition = editionVO.getValues();

                        // Fields can be read from the map or with getValue(name)
                        SimpleVO formatVO = (SimpleVO) edition.get("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);

                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);

                        SimpleVO descriptionVO =
                                (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\tDESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server ");
            } finally {
            }
        }
    }

Figure 1: Example of query execution on a wrapper

4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes packaged in a Jar file that is added to ITPilot, so they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. We recommend developing custom functions using Java annotations. It is also possible to use naming conventions instead, which allows custom functions to be created without dependencies on Denodo libraries. The annotations and the compound types and values required to create custom functions are located in:

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.

• All custom functions stored in the same jar are added or removed together, by uploading/removing the jar in the server.

• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.

• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class that contains its whole implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, the class must define a Java method implementing the functionality of the function with those arguments, plus one additional method if the signature returns a different type depending on the parameters or the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be basic (primitive) types: int, long, double, etc.
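As an illustration of the equivalency table above and of the note on primitive types, the following sketch looks up the ITPilot type that corresponds to a Java parameter class. It is a stand-alone helper written for this guide, not part of the Denodo libraries; the class and method names are hypothetical.

```java
import java.util.Calendar;
import java.util.LinkedHashMap;
import java.util.Map;

class TypeMappingSketch {

    // Java class -> ITPilot type name, as listed in the equivalency table.
    private static final Map<Class<?>, String> JAVA_TO_ITPILOT =
            new LinkedHashMap<Class<?>, String>();
    static {
        JAVA_TO_ITPILOT.put(Integer.class, "int");
        JAVA_TO_ITPILOT.put(Long.class, "long");
        JAVA_TO_ITPILOT.put(Float.class, "float");
        JAVA_TO_ITPILOT.put(Double.class, "double");
        JAVA_TO_ITPILOT.put(Boolean.class, "boolean");
        JAVA_TO_ITPILOT.put(String.class, "text");
        JAVA_TO_ITPILOT.put(Calendar.class, "date");
        JAVA_TO_ITPILOT.put(byte[].class, "binary");
    }

    // Returns the ITPilot type for a parameter class, or null if it is not in the table.
    // Primitive types (int, long, double, ...) are rejected, as stated in the note.
    static String itpilotTypeOf(Class<?> javaType) {
        if (javaType.isPrimitive()) {
            throw new IllegalArgumentException(
                    "primitive parameter types are not allowed: " + javaType);
        }
        for (Map.Entry<Class<?>, String> entry : JAVA_TO_ITPILOT.entrySet()) {
            // isAssignableFrom also accepts subclasses, e.g. GregorianCalendar for Calendar
            if (entry.getKey().isAssignableFrom(javaType)) {
                return entry.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(itpilotTypeOf(String.class));
        System.out.println(itpilotTypeOf(byte[].class));
    }
}
```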

4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.

For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + “ItpFunction”

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that in some cases calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid the limitations of naming conventions, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement. Class annotation used to define the class as a custom function. The annotation requires the parameters:

    • name: name of the custom function.

    • type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor. Method annotation used to mark the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional attribute, syntax, which specifies the syntax of the function signature shown to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType. Method annotation used to mark the method as the one that computes the return type of a function signature before executing a query.

• com.denodo.common.custom.annotations.CustomParam. Parameter annotation with the attribute name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
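The naming convention described at the start of this section can be sketched as a plain Java class with no Denodo dependencies. The class name and the execute(String... inputs) declaration come from the text above; the method body (concatenation that skips null inputs) is an assumption made only to have something to run.

```java
// Interpreted as a function named Concat_Sample because of the "ItpFunction" suffix.
// A deployable class would normally be declared public in its own source file.
class Concat_SampleItpFunction {

    // Signature method: must be named "execute". The array (varargs) parameter
    // gives the function arity n, since only the last parameter can be repeated.
    public String execute(String... inputs) {
        StringBuilder result = new StringBuilder();
        for (String input : inputs) {
            if (input != null) {          // null handling is an illustrative choice
                result.append(input);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        Concat_SampleItpFunction function = new Concat_SampleItpFunction();
        System.out.println(function.execute("IT", "Pilot"));
    }
}
```

Note that the parameters are objects (String) rather than primitive types, as required for custom function parameters.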

4.2 COMPOUND TYPES

Compound types and values in the custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType. Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue. Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType. Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue. Class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil. Helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on the input values, or functions returning compound types, can implement an additional method that computes the return type without executing the function. This is generally optional (rule 1 below describes the exception), but it provides better performance when the execution of the function is slower or more memory intensive than the return type calculation. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).

2. The execution method must have the same number of parameters as the additional method.

3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.

If the execute method returns a basic Java type, the additional method has to return the corresponding Java class. I.e. if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table ‘Equivalency between Java and ITPilot data types’ at the beginning of section 4 to find the type that these return values will have in ITPilot.
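The rules above can be illustrated with a minimal sketch that uses the naming conventions of section 4.1, so it compiles without the Denodo libraries. Repeat_SampleItpFunction is a hypothetical function invented for this example, and declaring the additional method with the return type Class<?> is an assumption. Because the return type here is always text, the additional method would be optional under rule 1; it is shown only to make the pairing between the two methods concrete.

```java
// Hypothetical custom function REPEAT_SAMPLE(arg1:text, arg2:int), naming-convention style.
// A deployable class would normally be declared public in its own source file.
class Repeat_SampleItpFunction {

    // Execution method: repeats 'value' the given number of times.
    public String execute(String value, Integer times) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < times; i++) {
            result.append(value);
        }
        return result.toString();
    }

    // Additional method: same number and types of parameters as execute (rules 2 and 3).
    // execute returns a String, so this method returns java.lang.String.class.
    public Class<?> executeReturnType(String value, Integer times) {
        return String.class;
    }

    public static void main(String[] args) {
        Repeat_SampleItpFunction function = new Repeat_SampleItpFunction();
        System.out.println(function.execute("ab", 3));
        System.out.println(function.executeReturnType("ab", 3).getName());
    }
}
```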

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        // Function signature: SPLIT_SAMPLE(regexp, value)
        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name="regexp") String regex,
                @CustomParam(name="value") String value) {

            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results =
                    new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues =
                    new ArrayList<CustomRecordValue>(result.length);
            // One register with a single "string" field per substring
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue =
                        CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        // Computes the return type (array of registers) without executing the function
        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props =
                    new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record =
                    CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array =
                    CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2: ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262].

The following sections assume some basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
        // wrapper implementation
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.

2. getInit() function: must be used to return the set of searchable parameters.

3. getOutputSchema() function: used to return the structure of the output objects, if they exist (1).

These functions mirror the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

(1) Since version 4.0SP1 this function, previously known as getMetadata, has been renamed to getOutputSchema. The old name is still supported for backwards compatibility, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on in section 5.3 (the Component Catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.

5.2.3 Generating the Output Structure

The getOutputSchema() function determines the wrapper’s output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter lists the predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.

NOTE: Some of the parameters of the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if a value has to be given for another argument to its right. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). rs.setText("FIELD", OBLIGATORY) is not valid; the following must be used: rs.setText("FIELD", "", OBLIGATORY).

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.

5.3.2.1 Record Structure

• Object: Record_Structure

• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).

• Functions:

o Constructor(name)

    • name: name of the structure.

o setText(field, regexp, type): creates a new character string field in the record.

    • field: name of the new field.

    • regexp (optional): regular expression describing the character string. By default, if no constraint exists, its value is “”.

    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

o setLink(field, type): creates a new Link-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

o setInt(field, type): creates a new Integer-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

o setBoolean(field, type): creates a new boolean-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

o setLong(field, type): creates a new Long-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

o setFloat(field, type): creates a new Float-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

o setDouble(field, type): creates a new Double-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.

    • field: name of the new field.

    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

o setDate(field, regexp, format, type): creates a new Date-type field in the record.

    • field: name of the new field.

    • regexp (optional): regular expression describing the character string. By default, if no constraint exists, its value is “”.

    • format (optional): date format following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.

    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

o setRegister(record, type): creates a new Record-type field in the record.

    • record: record name.

    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

o setArray(name, structure, type): creates a new Array-type field in the record.

    • name: name of the array.

    • structure: data structure that represents the record structure contained in the array.

    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

o toString(): transforms the record into a string of characters for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR, as explained in section 5.3.22, must be used, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List

• Functions:

o setListName(listName): sets the name of the list.

    • listName: name of the list.

o add(obj): adds an element to the list.

    • obj: element to add.

o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates the components that do not provide some of these “common” functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId parameter values.

o errorId: indicates the type of error for which the behavior is being defined. The possible values are:

    • RUNTIME_ERROR: error while the component is being run.

    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.

    • HTTP_ERROR: error produced by an http error.

    • TIMEOUT_ERROR: error raised if the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the http browser.

    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).

o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:

    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have any kind of return value will return “null” if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, the result is evaluated as “false” if they are used in a condition expression.

    • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configurable.

    • ON_ERROR_RETRY_IGNORE: rerun the wrapper as with the ON_ERROR_RETRY action, but continue with the wrapper execution if the error still occurs after the retries.
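The difference between the four error actions can be pictured with the following plain Java sketch. It is only one possible reading of the descriptions above and not the ITPilot implementation: the run() helper, the retries argument and the flaky() test step are hypothetical, the time between retries is left out, and the sketch retries a single step whereas the text speaks of rerunning the wrapper.

```java
import java.util.function.Supplier;

class ErrorActionSketch {

    enum ErrorAction { ON_ERROR_RAISE, ON_ERROR_IGNORE, ON_ERROR_RETRY, ON_ERROR_RETRY_IGNORE }

    // Runs a step under the given action; 'retries' is a hypothetical setting.
    static <T> T run(Supplier<T> step, ErrorAction action, int retries) {
        boolean retrying = action == ErrorAction.ON_ERROR_RETRY
                || action == ErrorAction.ON_ERROR_RETRY_IGNORE;
        int attempts = retrying ? Math.max(0, retries) + 1 : 1;
        RuntimeException last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return step.get();
            } catch (RuntimeException e) {
                last = e;              // remember the error and, if allowed, try again
            }
        }
        boolean ignoring = action == ErrorAction.ON_ERROR_IGNORE
                || action == ErrorAction.ON_ERROR_RETRY_IGNORE;
        if (ignoring) {
            return null;               // continue the run with a "null" result
        }
        throw last;                    // stop the run, reporting the error
    }

    // Test step that fails the first 'failures' calls and then returns 'value'.
    static Supplier<String> flaky(final int failures, final String value) {
        final int[] calls = {0};
        return () -> {
            if (calls[0]++ < failures) {
                throw new IllegalStateException("source unavailable");
            }
            return value;
        };
    }

    public static void main(String[] args) {
        System.out.println(run(flaky(2, "ok"), ErrorAction.ON_ERROR_RETRY, 3));
        System.out.println(run(flaky(5, "ok"), ErrorAction.ON_ERROR_RETRY_IGNORE, 2));
    }
}
```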

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level to be used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression, expressed as a string of characters. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition described in the constructor against the input parameter elements and returns “true” or “false” depending on whether the condition is met.
    • elements: the elements on which the condition is evaluated. This parameter must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”.
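The positional placeholders can be illustrated with a toy evaluator. This is only a sketch under stated assumptions: ToyCondition is a hypothetical stand-in, not the ITPilot Condition object, and it understands a single comparison between two placeholders rather than the full function set of [GENER].

```javascript
// Toy evaluator for expressions of the form "($i <op> $j)".
// It is not the ITPilot expression engine; it handles one comparison only.
const OPERATORS = {
  "<=": (a, b) => a <= b,
  ">=": (a, b) => a >= b,
  "<": (a, b) => a < b,
  ">": (a, b) => a > b,
  "==": (a, b) => a === b,
};

function ToyCondition(expr) {
  const match = /^\(?\s*\$(\d+)\s*(<=|>=|==|<|>)\s*\$(\d+)\s*\)?$/.exec(expr);
  if (!match) throw new Error("Unsupported expression: " + expr);
  this.left = Number(match[1]);   // index of the left-hand element
  this.op = OPERATORS[match[2]];  // comparison function
  this.right = Number(match[3]);  // index of the right-hand element
}

// elements is an array; $0 refers to elements[0], $1 to elements[1], ...
ToyCondition.prototype.exec = function (elements) {
  return this.op(elements[this.left], elements[this.right]);
};

const myCondition = new ToyCondition("($0 <= $1)");
const lowFirst = myCondition.exec([3, 7]);
const highFirst = myCondition.exec([9, 7]);
```

Here lowFirst is true because 3 <= 7 holds, and highFirst is false because 9 <= 7 does not.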

5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.

5.3.8 Diff

• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages and points out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added content.
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added content.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted content.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted content.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.
    • nullWhenEquals: “true” implies that “null” is returned when both pages are equal; “false” means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether the HTML tag attributes are taken into account when comparing both pages.
    • simplifyTags: “true” means that the HTML tag attributes are ignored. With “false” they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page.
    • mergedDeletions: with “true”, the deleted content is shown. If the value is “false”, the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. These regular expressions are applied to HTML tokens of the page; tokens that match are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
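How prefix and suffix labels wrap added and deleted tokens can be sketched with a generic token-level diff. The code below is an illustration only, assuming whitespace-separated tokens and a longest-common-subsequence comparison; toyDiff and the bracket labels are hypothetical, and nothing here claims to reproduce the algorithm or the default labels of the Diff component.

```javascript
// Toy token-level diff based on a longest-common-subsequence (LCS) table.
function toyDiff(baseCode, finalCode, labels) {
  const a = baseCode.split(/\s+/).filter(Boolean);
  const b = finalCode.split(/\s+/).filter(Boolean);

  // lcs[i][j] = length of the LCS of a[i..] and b[j..]
  const lcs = [];
  for (let i = 0; i <= a.length; i++) {
    lcs.push(new Array(b.length + 1).fill(0));
  }
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both token lists, labelling additions and deletions.
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && a[i] === b[j]) {
      out.push(a[i]); // unchanged token
      i++;
      j++;
    } else if (j < b.length && (i === a.length || lcs[i][j + 1] >= lcs[i + 1][j])) {
      out.push(labels.additionPrefix + b[j] + labels.additionSuffix);
      j++;
    } else {
      out.push(labels.deletionPrefix + a[i] + labels.deletionSuffix);
      i++;
    }
  }
  return out.join(labels.tokenSeparator);
}

// Hypothetical labels chosen for readability in plain text.
const labels = {
  additionPrefix: "[+", additionSuffix: "+]",
  deletionPrefix: "[-", deletionSuffix: "-]",
  tokenSeparator: " ",
};
const changed = toyDiff("a b c", "a x c", labels);
const unchanged = toyDiff("a b", "a b", labels);
```

With these inputs, changed marks x as added and b as deleted, while unchanged contains no labels at all, which is the situation in which setNullWhenEquals would matter.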

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4 Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression, expressed as a string of characters. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; “true” if the pattern merge technique is to be applied, “false” if not. This is “true” by default.
  o setI18n(i18n): updates the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch
• Description: this obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method used to recover the page state. With this method, ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): lets the user optionally set an explicit browse sequence that returns to the originating page, against which more information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

    • reusingConnection: if the value is set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
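The precedence among syncWithPost, setBackSequence and setBackPages can be summarized in a few lines. The sketch below is an illustration only: chooseRecovery and its config object are hypothetical, the back sequence is an opaque placeholder string, and the default of one page for the Back() case is an assumption of the sketch rather than documented behavior.

```javascript
// Decides how the page state would be recovered, following the precedence
// described above: POST synchronization first, then an explicit back
// sequence, and finally the Back() command.
function chooseRecovery(config) {
  if (config.syncWithPost) {
    return { method: "POST" }; // resend the original POST parameters
  }
  if (config.backSequence) {
    return { method: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // Assumption: go back one page when setBackPages was never called.
  const pages = config.backPages === undefined ? 1 : config.backPages;
  return { method: "BACK_COMMAND", pages: pages };
}

const withPost = chooseRecovery({
  syncWithPost: true,
  backSequence: "<NSEQL back sequence>",
});
const withSequence = chooseRecovery({
  syncWithPost: false,
  backSequence: "<NSEQL back sequence>",
});
const withBack = chooseRecovery({ syncWithPost: false, backPages: 2 });
```

Note that a configured back sequence is only consulted once POST synchronization has been switched off, which is why withPost ignores it.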

5.3.13 Filter

• Object: Filter
• Description: this carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for the list of records passed to the exec function.
    • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.
  o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
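The behavior described in the note can be sketched as follows. This is not the ITPilot Filter component: toyFilter, the predicate function and the sample records are hypothetical, and a plain JavaScript function stands in for the filter expression.

```javascript
// Toy filter: keeps the records for which predicate(record, aux) is true.
// When ignoreErrors is set, a record whose evaluation throws is dropped
// and the remaining records are still processed.
function toyFilter(predicate, inputRecords, auxiliaryRecords, ignoreErrors) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record, auxiliaryRecords)) kept.push(record);
    } catch (err) {
      if (!ignoreErrors) throw err; // comparable to ON_ERROR_RAISE
      // comparable to ON_ERROR_IGNORE: skip only the offending record
    }
  }
  return kept;
}

// Illustrative data: the second record cannot be evaluated.
const records = [{ price: 10 }, { price: null }, { price: 30 }];
const aux = [{ limit: 20 }];
const underLimit = (record, auxiliary) => {
  if (record.price === null) throw new Error("missing price");
  return record.price <= auxiliary[0].limit;
};

const cheap = toyFilter(underLimit, records, aux, true);
```

Here cheap contains only the record priced at 10: the record with a missing price is left out because of the error, and the record priced at 30 fails the condition.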

5.3.14 Form Iterator

• Object: Form_Iterator
• Description: this generates a run loop for a specific form, where predetermined values for each of the fields included are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: with “true”, the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field for the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field for the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field element has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held for each value element in the event of replicated values.
    • equal: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.

  o click(field, value, state): selects an element and runs a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the elements selected as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be added to the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values to be added to the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): determines whether the component launches the iterations in parallel.
    • flag: with “true”, the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.

  o hasNext(): determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was originally reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
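The way field values and the CLICKED_AND_NON_CLICKED_ELEMENT constant multiply into iterations can be sketched with a toy iterator. Everything below is hypothetical: ToyFormIterator uses simplified signatures (no position argument, no NSEQL), the constants are plain strings, allCombinations returns field/value objects instead of NSEQL sequences, and treating the iterations as the Cartesian product of all field options is an assumption of the sketch.

```javascript
// Plain-string stand-ins for the constants named above.
const CLICKED_ELEMENT = "CLICKED_ELEMENT";
const NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
const CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

function ToyFormIterator() {
  this.fields = [];            // [{ name, options: [...] }]
  this.maxIterations = Infinity;
  this.position = 0;
  this.combinations = null;    // computed lazily
}

// Simplified: the real function also takes a position argument.
ToyFormIterator.prototype.input = function (field, values) {
  this.fields.push({ name: field, options: values });
};

// Simplified checkbox handling: one or two options depending on state.
ToyFormIterator.prototype.click = function (field, state) {
  const options = state === CLICKED_AND_NON_CLICKED_ELEMENT
    ? [true, false]
    : [state === CLICKED_ELEMENT];
  this.fields.push({ name: field, options: options });
};

ToyFormIterator.prototype.setMaxIterations = function (count) {
  this.maxIterations = count;
};

// Cartesian product of the options of every configured field.
ToyFormIterator.prototype.allCombinations = function () {
  if (this.combinations === null) {
    let combos = [{}];
    for (const field of this.fields) {
      const next = [];
      for (const combo of combos) {
        for (const option of field.options) {
          next.push(Object.assign({}, combo, { [field.name]: option }));
        }
      }
      combos = next;
    }
    this.combinations = combos;
  }
  return this.combinations;
};

ToyFormIterator.prototype.hasNext = function () {
  const total = Math.min(this.allCombinations().length, this.maxIterations);
  return this.position < total;
};

ToyFormIterator.prototype.next = function () {
  return this.allCombinations()[this.position++];
};

// Two input values times two checkbox states give four combinations,
// capped at three by setMaxIterations.
const formIterator = new ToyFormIterator();
formIterator.input("city", ["Madrid", "Paris"]);
formIterator.click("newsletter", CLICKED_AND_NON_CLICKED_ELEMENT);
formIterator.setMaxIterations(3);

const visited = [];
while (formIterator.hasNext()) {
  visited.push(formIterator.next());
}
```

The hasNext/next loop at the end mirrors how the component is consumed; in the real component each call to next returns the page resulting from one form submission.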

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing by means of a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page.
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored “cookies” information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created as the group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional, but once specified it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) used by the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
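The three obligation modes can be illustrated with a toy initialization record. This sketch makes several assumptions that are not stated in the guide: ToyInit is a hypothetical stand-in, its exec takes the caller's values as an argument (the real exec() takes none and reads the JavaScript context), a FIXED field overrides any value supplied by the caller, and a missing OPTIONAL field is stored as null.

```javascript
// Plain-string stand-ins for the obligation constants.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

function ToyInit() {
  this.fields = {}; // field name -> { obl, fixedValue }
}

// Registers a text field; obl defaults to OPTIONAL as described above.
ToyInit.prototype.setText = function (field, obl, fixedValue) {
  this.fields[field] = { obl: obl || OPTIONAL, fixedValue: fixedValue };
};

// Builds the record of query parameters from the values supplied by the
// caller (hypothetical argument; see the assumptions in the lead-in).
ToyInit.prototype.exec = function (queryValues) {
  const record = {};
  for (const name of Object.keys(this.fields)) {
    const spec = this.fields[name];
    if (spec.obl === FIXED) {
      record[name] = spec.fixedValue;          // constant, caller ignored
    } else if (name in queryValues) {
      record[name] = queryValues[name];
    } else if (spec.obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + name);
    } else {
      record[name] = null;                     // optional and not supplied
    }
  }
  return record;
};

const init = new ToyInit();
init.setText("query", OBLIGATORY);
init.setText("lang", FIXED, "en");
init.setText("category");

const initRecord = init.exec({ query: "laptops", lang: "fr" });
```

In this run initRecord keeps the caller's query, forces lang to the fixed value, and leaves category empty; calling exec without a query would throw.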

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one at a time.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. Returns “true” if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphic interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
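The hasNext()/next() contract can be tried outside ITPilot with a small stand-in. The following is only a sketch: ListIterator and collectNames are hypothetical names written for illustration, the NAME field is invented, and the record list is assumed to behave like a plain JavaScript array. It is not the ITPilot Iterator itself.

```javascript
// Hypothetical stand-in for the ITPilot Iterator component (illustration only).
function ListIterator(list) {
    this.list = list;   // ordered sequence of records
    this.position = 0;  // index of the next record to return
}

// Returns true if at least one more record is available.
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};

// Returns the next record and advances the cursor.
ListIterator.prototype.next = function () {
    return this.list[this.position++];
};

// Sequential traversal, one record at a time.
function collectNames(records) {
    var iterator = new ListIterator(records);
    var names = [];
    while (iterator.hasNext()) {
        var record = iterator.next();
        names.push(record.NAME); // NAME is an assumed field name
    }
    return names;
}
```

In the parallel variant of Figure 5, the body of the while loop hands each record to a Thread object instead of processing it inline.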

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
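The roles of maxPoolSize and initialPoolSize can be illustrated with a toy pool. This is a sketch under assumptions, not the pool used by ITPilot: SketchPool and its methods are hypothetical, connections are opened by a factory function supplied by the caller, and the status check performed with checkQuery/pingQuery is left out.

```javascript
// Hypothetical in-memory pool (illustration only; not the ITPilot implementation).
function SketchPool(maxPoolSize, initialPoolSize, openConnection) {
    if (initialPoolSize > maxPoolSize) {
        throw new Error("initialPoolSize cannot exceed maxPoolSize");
    }
    this.maxPoolSize = maxPoolSize;
    this.openConnection = openConnection; // factory supplied by the caller
    this.idle = [];
    this.inUse = 0;
    // Idle connections are established up front, ready to be used.
    for (var i = 0; i < initialPoolSize; i++) {
        this.idle.push(openConnection());
    }
}

// Hands out an idle connection, or opens a new one while below the limit.
SketchPool.prototype.acquire = function () {
    if (this.idle.length === 0 && this.inUse >= this.maxPoolSize) {
        return null; // pool exhausted; a real pool would typically wait here
    }
    var connection = this.idle.length > 0 ? this.idle.pop() : this.openConnection();
    this.inUse++;
    return connection;
};

// Returns a connection to the pool so that it can be reused.
SketchPool.prototype.release = function (connection) {
    this.inUse--;
    this.idle.push(connection);
};
```

With maxPoolSize = 2 and initialPoolSize = 1, one connection is opened immediately, a second one on demand, and a third request is refused until a connection is released.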

5.3.19 Loop

• Description: allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript as a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6 Using the Loop function
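The WHILE… DO behaviour can be reproduced with a stand-in for the Condition object. The sketch below assumes only that exec() returns a boolean; SketchCondition and countPagesWhile are hypothetical names, and the real Condition component evaluates ITPilot expressions rather than a JavaScript predicate.

```javascript
// Hypothetical stand-in for a Condition object: exec() evaluates a predicate.
function SketchCondition(predicate) {
    this.predicate = predicate;
}
SketchCondition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// WHILE ... DO: the condition is checked before every pass,
// so the body may run zero times.
function countPagesWhile(totalPages) {
    var visited = 0;
    var loop = new SketchCondition(function () { return visited < totalPages; });
    while (loop.exec([])) {
        visited++; // the loop operations would go here
    }
    return visited;
}
```

Calling countPagesWhile(0) performs no pass at all, which is the main difference from the Repeat component.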

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: iterates through different inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched maintaining the session’s information.
    • inputPage: page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence nor a POST navigation has been configured for going back.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
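The interplay between setRetries and setRetryDelay can be pictured as a retry loop. The sketch below is an assumption about the general technique, not ITPilot code: runWithRetries is a hypothetical helper, the number of retries is taken to mean additional attempts after the first one, and the wait function is passed in as sleep so that the example stays synchronous.

```javascript
// Hypothetical helper (not part of ITPilot): calls `action` until it succeeds
// or the retries are used up. `sleep` is injected by the caller.
function runWithRetries(action, retries, retryDelayMs, sleep) {
    var lastError = null;
    for (var attempt = 0; attempt <= retries; attempt++) {
        try {
            return action(attempt);
        } catch (error) {
            lastError = error;
            if (attempt < retries) {
                sleep(retryDelayMs); // wait before the next retry
            }
        }
    }
    throw lastError; // every attempt failed
}
```

For example, with three retries and a delay of 500 milliseconds, an access that fails twice and then succeeds waits twice before returning the page.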

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds a component input record to the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field. For example, “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, the Record Constructor returns the list of filtered elements except for the one that caused the error.
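The way a field expression refers to an input record, and the effect of the two error actions, can be sketched as follows. This is an illustration under assumptions: it supports only plain references written as $<index>.<FIELD> (the dotted form is assumed here), the function names are hypothetical, and the functions of Appendix A of [GENER] are not covered.

```javascript
// Hypothetical resolver for references of the form "$<index>.<FIELD>"
// (illustration only; real expressions also accept the functions of [GENER]).
function resolveFieldReference(reference, records) {
    var match = /^\$(\d+)\.(\w+)$/.exec(reference);
    if (match === null) {
        throw new Error("Unsupported reference: " + reference);
    }
    var record = records[Number(match[1])];
    if (record === undefined) {
        throw new Error("No input record at position " + match[1]);
    }
    return record[match[2]];
}

// Builds an output record from field definitions, honoring the error action.
function buildRecord(records, definitions) {
    var output = {};
    definitions.forEach(function (definition) {
        try {
            output[definition.fieldName] = resolveFieldReference(definition.expression, records);
        } catch (error) {
            if (definition.errorAction !== "ON_ERROR_IGNORE") {
                throw error; // ON_ERROR_RAISE: stop and report the error
            }
            // ON_ERROR_IGNORE: the failing field is simply left out
        }
    });
    return output;
}
```

A definition that points to a missing input record is skipped when its error action is ON_ERROR_IGNORE, and aborts the construction otherwise.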

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched maintaining the session’s information. In general this value will be “true”, although it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; indicates the starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript as a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
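Because the condition of a do… while loop is evaluated after the body, the loop operations always run at least once. The sketch below shows this with a stand-in object; SketchRepeatCondition and countPassesRepeat are hypothetical names, and the only assumption about the real Condition component is that exec() returns a boolean.

```javascript
// Hypothetical stand-in for a Condition object (illustration only).
function SketchRepeatCondition(predicate) {
    this.exec = function (inputs) {
        return predicate(inputs);
    };
}

// The body runs first and the condition is checked afterwards,
// so at least one pass is always made.
function countPassesRepeat(totalPages) {
    var visited = 0;
    var repeat = new SketchRepeatCondition(function () { return visited < totalPages; });
    do {
        visited++; // the loop operations would go here
    } while (repeat.exec([]));
    return visited;
}
```

Calling countPassesRepeat(0) still performs one pass, whereas an equivalent while loop would perform none.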

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphic generation interface, it becomes a JavaScript function that is invoked from the position the component occupies in the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the component browsing sequence is run.
  o setRetries(count): update function for the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence nor a POST navigation has been configured for going back.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
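The order in which the three ways of returning to the source page are considered (POST synchronization, explicit back sequence, NSEQL Back()) can be summarized as a small decision function. This is only a reading aid: chooseBackStrategy and the shape of its config argument are hypothetical, and the default of one back page is an assumption, not a documented value.

```javascript
// Hypothetical decision helper mirroring the fallback order described above.
function chooseBackStrategy(config) {
    if (config.syncWithPost) {
        // Re-issue the POST with which the page was reached.
        return { strategy: "POST" };
    }
    if (config.backSequence) {
        // An explicit back sequence was set with setBackSequence.
        return { strategy: "BACK_SEQUENCE", sequence: config.backSequence };
    }
    // Neither option applies: fall back to NSEQL Back(), repeated once per
    // page set with setBackPages (assumed default: 1).
    return { strategy: "NSEQL_BACK", pages: config.backPages || 1 };
}
```

For instance, a configuration with syncWithPost set to false, no back sequence and backPages set to 2 resolves to the NSEQL Back() strategy with two pages.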

5.3.28 Store File

• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): update function for the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all the executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches the execution thread on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
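The queuing behaviour of setMaxConcurrentThreads can be modelled without real threads. In the sketch below, SketchLimiter is a hypothetical bookkeeping object: it only records which tasks would be running and which would be waiting, and it assumes that queued requests are served in arrival order, which the text above does not state.

```javascript
// Hypothetical limiter (illustration only): at most `maxConcurrent` tasks are
// marked as running; later requests wait in a FIFO queue.
function SketchLimiter(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = [];
    this.queued = [];
}

// Starts the task if a slot is free, otherwise queues it.
SketchLimiter.prototype.execute = function (taskName) {
    if (this.running.length < this.maxConcurrent) {
        this.running.push(taskName);
    } else {
        this.queued.push(taskName);
    }
};

// Marks a task as finished and promotes the oldest queued request, if any.
SketchLimiter.prototype.finish = function (taskName) {
    var index = this.running.indexOf(taskName);
    if (index === -1) {
        return; // unknown or already finished task
    }
    this.running.splice(index, 1);
    if (this.queued.length > 0) {
        this.running.push(this.queued.shift());
    }
};
```

With a limit of two, a third request stays queued until one of the first two executions finishes.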

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }

  o This function defines the output structure returned by the component. It is only necessary when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
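Put together, the four functions of a custom component file could look like the sketch below. The component name “mycustom” follows the text; everything else is an assumption made for illustration: the TEXT and UPPER_TEXT fields are invented, the schemas are represented as plain objects because the real structure format is not shown in this section, and RECORD_TYPE is redefined locally only so that the sketch runs outside ITPilot.

```javascript
// Output type constant with the value listed above; redefined here only so
// that the sketch runs outside ITPilot.
var RECORD_TYPE = 3;

// Main function of a hypothetical custom component named "mycustom".
// The input is assumed to be a plain object with a TEXT field.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    mycustom_output = { UPPER_TEXT: String(mycustom_input.TEXT).toUpperCase() };
    return mycustom_output;
}

// Input schema, represented here as a plain object (assumed representation).
function mycustom_getInputStructure() {
    return { TEXT: "string" };
}

// The component returns a single record.
function mycustom_getOutputType() {
    return RECORD_TYPE;
}

// Needed because the output type is RECORD_TYPE.
function mycustom_getOutputStructure() {
    return { UPPER_TEXT: "string" };
}
```

A component whose output type is a simple type, such as STRING_TYPE, would omit mycustom_getOutputStructure.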

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section.

To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the following VQL statement has to be written:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

PREFACE

SCOPE

Denodo ITPilot enables easy access to, and extraction of, data from semi-structured Web data sources. This document is an introduction to application development using wrappers created by Denodo ITPilot.

WHO SHOULD USE THIS DOCUMENT

This document is aimed at developers who want to learn how to develop applications that make the best use of the advanced automation and Web data extraction functionalities provided by Denodo ITPilot. The detailed information required to install and manage the system is provided in other manuals, which are referenced as the need arises.

SUMMARY OF CONTENTS

More specifically, this document:

• Presents the fundamental steps needed to develop an application that uses the wrappers generated by Denodo ITPilot.
• Describes the task of exporting and deploying a wrapper as a Web Service.
• Gives a detailed description of how to use the development API offered by Denodo ITPilot.
• Provides an example of how to develop an application that uses a wrapper installed in a Denodo ITPilot execution server.
• Details how to create custom ITPilot functions.
• Explains how to develop wrappers by using the ITPilot JavaScript components.

1 INTRODUCTION

Denodo ITPilot is a Denodo Technologies solution for extracting and structuring the data present in Web sources. This is done by constructing an abstraction of the target Web source, called a “wrapper”, that frees client applications from the difficulties of accessing and extracting the required data. ITPilot provides a distributed and scalable environment for generating, executing and maintaining wrappers. See [USER] and [GENER] for more information on how to create, install and maintain wrappers using Denodo ITPilot.

This manual describes the Java development API, which allows creating clients that use wrappers that have already been generated and installed. It gives the basic guidelines for using the API, describes its main components and provides some examples of use. See the Javadoc documentation [JDOC] for more details on classes, attributes and operations. In addition, this manual explains how to access wrappers through Web Services exported in the execution environment.

2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES

The wrappers saved in the execution server can be invoked in two different ways. Firstly, the native ITPilot Java API can be used to access the wrappers, obtain their data structure and run queries on them from a Java application. This API is described in section 3. The other option, described in this section, is to expose the wrappers through Web Services.

A Web Service containing the following operations can be generated for a particular wrapper:

• An operation containing all searchable and compulsory parameters.
• Optionally, another operation with all searchable and compulsory parameters, plus any searchable and optional parameters selected in the Web Service generation process (this process is defined in [USER]).

The ITPilot execution server generates a Web Service as a .war file that can be deployed in any J2EE application server.

2.1 WEB SERVICE TYPES

ITPilot allows a wrapper to be published as a Web Service so that it can be used by any external application. The ITPilot execution server generates a Web Service as a .war file that can be deployed in any J2EE application server. The types of Web Services that ITPilot can publish are:

• SOAP [SOAP] Web Services.
• REST-style Web Services, which use HTTP directly as the transport protocol and return data encoded in XML.
• HTML Web Services. They are similar to the REST-style Web Services, but the output consists of an HTML table containing the response data for the query executed. The table includes JavaScript code to sort the results by any field and/or paginate the returned results. It is also possible to adjust the size of the table and the cells, and to modify its graphic appearance using a CSS file.

The following sections describe the querying process for these Web Services.

2.2 INVOKING SOAP WEB SERVICES

The SOAP version of the published Web Services can be accessed by using any Web Service client or client generator that meets the SOAP 1.2 [SOAP] and WSDL 1.1 [WSDL] standards, such as the Apache Axis wsdl2java [AXIS] or the .NET Framework wsdl [DOTNET] tools. The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL of the Web Service WSDL:

    http://<domain>:<port>/<service_name>/services/<service_name>?wsdl

The samples/itpilot/itp-clients directory of the ITPilot distribution contains a sample client generated using Apache Axis. The README file in this path contains detailed information on how to generate, compile and run the files that make up the client application.

2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES

This section describes how to invoke the REST and HTML versions of the Web Services published by ITPilot, once they have been deployed in the Web Service container. Once the .war file has been deployed in the J2EE application server, the relative paths /rest and /html of the webapp show an information screen for the respective Web Service version, listing the available operations.


Example: if the Web service container is running on port 9090 of the acme host and the name chosen for the exported web service was testWS, the access URLs for the information page in the REST (XML output) and HTML versions would be:

http://acme:9090/testWS/rest
http://acme:9090/testWS/html

For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the xsd file describing the schema of the XML document that a call to each operation returns. To access the XML Schema of the data returned by invoking an operation of the REST version of the Web service, the following URL format should be used:

http://host:port/serviceName/rest/opName.xsd

Example: again, if the Web service container runs on port 9090 of the acme host and the name chosen for the exported web service was testWS, the following URL obtains the XML Schema of the data returned by the operation getPRODUCTDATA:

http://acme:9090/testWS/rest/getPRODUCTDATA.xsd

The format used to invoke a specific operation in the REST version is the following:

http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen

where n is the number of parameters of the operation. The format for the HTML version is the same, but replacing 'rest' with 'html'.

Example: the Web service container runs on port 9090 of the acme host and the name chosen for the exported web service was testWS. Let us also suppose that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML Web service versions, respectively:

http://acme:9090/testWS/rest/getPRODUCTDATA
http://acme:9090/testWS/html/getPRODUCTDATA

If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results for a prod_id value of 1 would be obtained with:

http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
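A client that calls these operations programmatically needs to assemble the invocation URL from its parts. The following Java sketch shows one way to do so. RestUrlBuilder is a hypothetical helper, not part of the ITPilot distribution, and it assumes the conventional query-string layout (a '?' before the first parameter and '&' between parameters), with names and values URL-encoded as UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of ITPilot) that assembles an operation URL. */
final class RestUrlBuilder {

    /**
     * @param version "rest" for XML output or "html" for the HTML table
     * @param params  operation input parameters, in the order they should appear
     */
    static String build(String host, int port, String serviceName,
                        String version, String opName, Map<String, String> params) {
        StringBuilder url = new StringBuilder("http://")
                .append(host).append(':').append(port)
                .append('/').append(serviceName)
                .append('/').append(version)
                .append('/').append(opName);
        char separator = '?'; // assumed query-string layout: ?a=1&b=2
        for (Map.Entry<String, String> param : params.entrySet()) {
            url.append(separator)
               .append(encode(param.getKey()))
               .append('=')
               .append(encode(param.getValue()));
            separator = '&';
        }
        return url.toString();
    }

    private static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("prod_id", "1");
        System.out.println(
                build("acme", 9090, "testWS", "rest", "getPRODUCTDATABYPRODID", params));
    }
}
```

Encoding the values matters as soon as a parameter contains spaces or reserved characters; the unencoded examples in this section work only because their values are plain alphanumerics.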

2.3.1 HTML Output Configuration

The HTML version of the published Web Services may be invoked with certain additional parameters to configure the HTML table used to display the results of the queries. The additional parameters are as follows:

• shownumresults: If this parameter is given the value true, the table will display information on the number of results obtained by the wrapper.

• intervalsize: If this parameter is indicated, the results obtained by the wrapper will be displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.

• maxresults: This indicates the maximum number of results to be displayed. If the wrapper run returns more results than indicated, all excess results will be discarded.

• cellwidth: Maximum cell width, expressed in number of characters. The width of each cell in the table will be adapted to the text, except where the size indicated in this parameter is exceeded. In this case, carriage returns will be added to divide the text into lines.

• cellheight: Maximum number of lines in a cell after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of the column are given a scroll bar.

• width: This specifies the maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.

• height: This specifies the maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.

These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:

http://host:port/serviceName/html/opName/paramName1/value1/.../paramNamen/valuen

For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a pagination interval size of 10. Once again, it is assumed that the Web service container runs on port 9090 of the acme machine:

http://acme:9090/testWS/html/getPRODUCTDATA/maxresults/50/intervalsize/10
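The display options above travel in the access path rather than in the query string. The following Java sketch appends them to an operation URL. HtmlOptionsUrl is a hypothetical helper, and the layout it produces (each option as a /name/value pair of path segments) is an assumption based on the description of the format in this section; adjust the separators if the deployed service expects something different.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of ITPilot) for the HTML display options. */
final class HtmlOptionsUrl {

    /**
     * Appends display options to the access path of an HTML operation URL.
     * Assumed layout: .../html/opName/option1/value1/option2/value2
     */
    static String build(String operationUrl, Map<String, Object> displayOptions) {
        StringBuilder url = new StringBuilder(operationUrl);
        for (Map.Entry<String, Object> option : displayOptions.entrySet()) {
            url.append('/').append(option.getKey())
               .append('/').append(option.getValue());
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the options in insertion order.
        Map<String, Object> options = new LinkedHashMap<String, Object>();
        options.put("maxresults", 50);
        options.put("intervalsize", 10);
        System.out.println(
                build("http://acme:9090/testWS/html/getPRODUCTDATA", options));
    }
}
```

Any operation input parameters would then follow the path as ordinary query parameters.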

2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES

Once the Web Service operations have been exported, some parameters can be used to configure the connection pool that the Web Services use to connect to the ITPilot server. The web.xml file, found in the WEB-INF path of the exported web service (either inside the war file generated by ITPilot or in the directory where the Web Service has been deployed), has three parameters for configuring the connection pool:

1. poolEnabled: enables or disables the connection pool. The possible values are "true" or "false".

    <env-entry>
        <env-entry-name>poolEnabled</env-entry-name>
        <env-entry-value>false</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

2. poolInitSize: defines the initial size of the connection pool.

    <env-entry>
        <env-entry-name>poolInitSize</env-entry-name>
        <env-entry-value>0</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of active connections reaches this value, new requests are queued until a connection becomes free.

    <env-entry>
        <env-entry-name>poolMaxActive</env-entry-name>
        <env-entry-value>30</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>


3 ITPILOT DEVELOPMENT API

Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Among other functions, this API makes it possible to connect to a Denodo ITPilot execution server, obtain a reference to a wrapper installed in that server, and query it. It also supports a series of additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.

The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance makes it possible to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance may be used to carry out various tasks on the wrapper, the most important of which is query execution.

When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).

The following subsections deal in more detail with each of the stages mentioned: connecting to the server, obtaining references to wrappers, executing actions on them, and processing query results. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].

3.1 CONNECTING TO THE SERVER

There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.

If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server is running.

If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, it is possible to specify any database created in the Virtual DataPort server when connecting, and to use any user defined in it. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users in Denodo Virtual DataPort).

In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server is running, this constructor takes the name of the Virtual DataPort database to connect to, the user ID with which access is to be made, and the associated password.

Note that even if Virtual DataPort is installed, it is equally possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called 'itpilot' is accessed, and the predefined user 'admin' (with the initial password 'admin') is used to gain access.


3.2 OBTAINING WRAPPERS

As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class incorporates methods for obtaining data on the execution server and accessing the wrappers present in it:

• Collection getHTMLWrapperNames(): Obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only those wrappers associated with that database are obtained.

• HTMLWrapperProxy getHTMLWrapper(String wpName): Obtains a reference to the wrapper whose name is specified as the parameter.

• Collection getDatabaseNames(): This method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.

• void deleteWrapper(String wpName): Deletes from the server the wrapper whose name is specified as the parameter.

• void loadWrapper(String vql): Takes as its input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.

• String getVQL(): Returns the VQL description of all wrappers in the ITPilot execution server.

3.3 USING WRAPPERS

Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:

HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified during the creation of the wrapper. The values must be specified as character strings, even when the input parameters expected by the wrapper belong to another type. For example, if a wrapper expects a float-type parameter and we want to assign the value 325 when invoking it, we must pass the string "325". In the case of float, double and date data types, it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or, in the case of date data types, the date pattern if one was set.

For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.

Although most applications will not require it, a wrapper schema can be obtained using the method:

HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).

The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and these fields may again be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the various atomic fields that comprise the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field: its type (method java.lang.Class getType()); whether the value is obtained from the source or not, that is, whether it is a searchable field that does not appear in the output schema (method boolean isSearchStatus()); and, in that case, whether it is mandatory or not (method boolean isMandatoryStatus()). Furthermore, if they were defined during the generation process, it is also possible to obtain the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).

Finally, the methods

void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting, via the API, whether or not a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the DENODO_HOME/metadata/maintenance-backup path (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
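The hierarchical result structure described in this section (atomic fields alongside compound fields that hold arrays of registers) can be pictured with plain Java collections. The sketch below is only an illustration of that shape: it does not use the ITPilot classes, a register is modeled as an ordered map, a compound value as a list of such maps, and the movie values are invented sample data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative model only; the real API represents these values with its own classes. */
final class HierarchicalRecordSketch {

    /** Builds a register from alternating field names and values, keeping field order. */
    static Map<String, Object> register(Object... namesAndValues) {
        Map<String, Object> fields = new LinkedHashMap<String, Object>();
        for (int i = 0; i + 1 < namesAndValues.length; i += 2) {
            fields.put((String) namesAndValues[i], namesAndValues[i + 1]);
        }
        return fields;
    }

    /** Renders atomic fields inline and recurses into compound fields (lists of registers). */
    @SuppressWarnings("unchecked")
    static String render(Map<String, Object> register, String indent) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Object> field : register.entrySet()) {
            if (field.getValue() instanceof List) {
                out.append(indent).append(field.getKey()).append(":\n");
                int position = 0;
                for (Map<String, Object> nested
                        : (List<Map<String, Object>>) field.getValue()) {
                    position++;
                    out.append(indent).append("  [").append(position).append("]\n");
                    out.append(render(nested, indent + "    "));
                }
            } else {
                out.append(indent).append(field.getKey()).append(" = ")
                   .append(field.getValue()).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Invented sample values, following the TITLE/DIRECTOR/EDITIONS schema.
        List<Map<String, Object>> editions = new ArrayList<Map<String, Object>>();
        editions.add(register("FORMAT", "DVD", "PRICE", 9.5, "DESCRIPTION", "Sample edition"));
        editions.add(register("FORMAT", "VHS", "PRICE", 4.0, "DESCRIPTION", "Another sample"));
        Map<String, Object> movie = register(
                "TITLE", "Sample Title", "DIRECTOR", "Sample Director", "EDITIONS", editions);
        System.out.print(render(movie, ""));
    }
}
```

Because a register may itself contain compound fields, any code that walks a result has to be recursive (or use an explicit stack), as render is here.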

3.4 PROCESSING QUERY RESULTS

The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (it is important to remember that the wrapper obtains the data from the source in real time, through the network).


The method hasNext() checks whether there are still elements to return. Because of the asynchronous behavior, this method must be called before accessing each element, to make sure that data elements are available.

The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method

com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname)

where fieldname is the name of the desired field. The method next() throws an exception of type NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX represents the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator has two methods for this:

• boolean checkErrors(): Checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.

• String getErrorDescription(): Where an error has occurred, this returns a textual description of it. Otherwise, it returns null. The custom error messages specified by the wrapper creator for the 'raise error' handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
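The consumption pattern described in this section (test hasNext() before every next(), then check for errors once iteration ends) can be sketched as follows. So that the sketch compiles without the Denodo client library, ResultIterator is a hypothetical stand-in interface that declares only the methods named above, and fake() is an in-memory implementation used solely to exercise the loop; neither is part of the ITPilot API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class ResultDrainSketch {

    /** Hypothetical stand-in exposing only the iterator methods named in this section. */
    interface ResultIterator extends Iterator<Object> {
        boolean checkErrors();
        String getErrorDescription();
    }

    /** Collects every result, then fails if the query reported an error. */
    static List<Object> drain(ResultIterator results) {
        List<Object> rows = new ArrayList<Object>();
        while (results.hasNext()) {   // always test before calling next()
            rows.add(results.next());
        }
        if (results.checkErrors()) {  // errors are inspected after iteration ends
            throw new IllegalStateException(
                    "Query failed: " + results.getErrorDescription());
        }
        return rows;
    }

    /** In-memory fake used only to exercise drain(); error == null means success. */
    static ResultIterator fake(List<Object> rows, final String error) {
        final Iterator<Object> delegate = rows.iterator();
        return new ResultIterator() {
            public boolean hasNext() { return delegate.hasNext(); }
            public Object next() { return delegate.next(); }
            public void remove() { throw new UnsupportedOperationException(); }
            public boolean checkErrors() { return error != null; }
            public String getErrorDescription() { return error; }
        };
    }

    public static void main(String[] args) {
        List<Object> rows = drain(fake(Arrays.<Object>asList("row 1", "row 2"), null));
        System.out.println(rows.size() + " rows read");
    }
}
```

Throwing after the loop discards any rows already read when the query ends in error. An application that prefers partial results could return the rows together with the error description instead; which behavior is appropriate depends on the application.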


3.4.1 Canceling Queries

The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

void cancel()

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the 'acme' machine, on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in the preceding sections:

TITLE
DIRECTOR
EDITIONS
    FORMAT
    PRICE
    DESCRIPTION

where TITLE and DIRECTOR are optional search fields. A query is then issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects representing the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text, except the field PRICE, which is a double. Finally, any errors produced during execution are checked.


    package com.denodo.itpilot.client;

    import java.util.List;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Iterator;

    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

    public class ITPilotExample {

        public static void main(String args[]) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server =
                        new HTMLWrapperServerProxy("acme", 9999);

                // Get wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results; hasNext() must be checked before each next()
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ". ");

                    // Get and print atomic fields TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);

                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);

                        RegisterVO editionVO = (RegisterVO) editions.next();

                        SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);

                        // PRICE is the only non-text field (a double)
                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);

                        SimpleVO descriptionVO =
                                (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\t DESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server: " + e);
            } finally {
            }
        }
    }

Figure 1: Example of query execution to a wrapper
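The example in Figure 1 passes only a text parameter. Section 3.3 notes that every value must be supplied as a string and that float, double and date values have to follow the wrapper's internationalization configuration or date pattern. The following sketch shows one way to prepare such strings with the standard Java formatting classes. The parameter names MAX_PRICE and RELEASED_AFTER, the German locale and the dd/MM/yyyy pattern are all invented for illustration; the real values depend on how the wrapper's Init component was configured.

```java
import java.text.NumberFormat;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper (not part of ITPilot) that turns typed values into strings. */
final class QueryParamFormatting {

    /** Formats a number with the decimal separator of the given locale. */
    static String formatDouble(double value, Locale wrapperLocale) {
        NumberFormat format = NumberFormat.getNumberInstance(wrapperLocale);
        format.setGroupingUsed(false);       // no thousands separators
        format.setMaximumFractionDigits(10); // avoid the default rounding to 3 digits
        return format.format(value);
    }

    /** Formats a date with the given pattern, in the calendar's own time zone. */
    static String formatDate(Calendar date, String wrapperDatePattern) {
        SimpleDateFormat format = new SimpleDateFormat(wrapperDatePattern, Locale.ROOT);
        format.setTimeZone(date.getTimeZone());
        return format.format(date.getTime());
    }

    public static void main(String[] args) {
        Map<String, String> queryParams = new HashMap<String, String>();
        // Assumed: the wrapper is configured with a locale that uses a decimal comma.
        queryParams.put("MAX_PRICE", formatDouble(32.5, Locale.GERMANY));
        // Assumed: the wrapper's date pattern is dd/MM/yyyy.
        queryParams.put("RELEASED_AFTER",
                formatDate(new GregorianCalendar(2011, Calendar.AUGUST, 16), "dd/MM/yyyy"));
        System.out.println(queryParams);
    }
}
```

Formatting the value with the wrong locale (for example "32.5" where the wrapper expects "32,5") is an easy mistake to make, so it helps to keep the locale and pattern in one place next to the code that builds the parameter map.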


4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several can be grouped in a single Jar. We recommend developing custom functions using Java annotations, although naming conventions can be used instead, in which case the functions can be created without dependencies on Denodo libraries. The annotations, compound types and values required to create custom functions are located in:

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.

• All custom functions stored in the same jar are added or removed together, by uploading or removing the jar in the server.

• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.

• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing its entire implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, the class must define a Java method implementing the function for those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to ITPilot data types. The following table shows how the mapping works and which Java types can be used:

Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be primitive types (int, long, double, etc.).
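The equivalency table can be captured as a small lookup, which is convenient in build-time checks or unit tests of custom functions. ItpilotTypeTable below is a hypothetical helper, not a Denodo class. It looks up exactly the Java types listed in the table and rejects everything else, including primitives such as int, in line with the note above.

```java
import java.util.Calendar;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical lookup mirroring the Java/ITPilot equivalency table. */
final class ItpilotTypeTable {

    private static final Map<Class<?>, String> TYPES = new LinkedHashMap<Class<?>, String>();

    static {
        TYPES.put(Integer.class, "int");
        TYPES.put(Long.class, "long");
        TYPES.put(Float.class, "float");
        TYPES.put(Double.class, "double");
        TYPES.put(Boolean.class, "boolean");
        TYPES.put(String.class, "text");
        TYPES.put(Calendar.class, "date");
        TYPES.put(byte[].class, "binary");
    }

    /** Returns the ITPilot type name for a declared Java type listed in the table. */
    static String itpilotTypeOf(Class<?> javaType) {
        String name = TYPES.get(javaType);
        if (name == null) {
            // Covers primitives (int, long, ...) and any class not in the table.
            throw new IllegalArgumentException(
                    "No ITPilot equivalent listed for " + javaType.getName());
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(itpilotTypeOf(Integer.class)); // wrapper class: accepted
        try {
            itpilotTypeOf(int.class);                      // primitive: rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```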


4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.

To make a Java class recognizable as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction is interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String generates the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. For example, the class Concat_SampleItpFunction could declare a method as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or that return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions cover only a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:

    • name: name of the custom function.

    • type: in ITPilot, it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to mark the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional attribute, syntax, which specifies how the function signature is presented to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to mark the method as the one that computes the return type of a function signature before a query is executed.

• com.denodo.common.custom.annotations.CustomParam: Parameter annotation carrying the parameter name, used to make the automatically generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
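For comparison with the annotations just listed, the naming-convention approach described at the start of this section needs no Denodo classes at all. The following sketch follows the Concat_SampleItpFunction example from the text, using the variable-arity form. Skipping null inputs and declaring the return type of executeReturnType as Class<String> are assumptions made for this sketch, not documented behavior.

```java
// Declared without 'public' only so the sketch fits in a single file;
// a deployable function would presumably live in its own public class.
final class Concat_SampleItpFunction {

    /**
     * Variable-arity signature: the last (here, the only) parameter may be
     * repeated n times. Null inputs are skipped, which is a choice made for
     * this sketch.
     */
    public String execute(String... inputs) {
        StringBuilder result = new StringBuilder();
        for (String input : inputs) {
            if (input != null) {
                result.append(input);
            }
        }
        return result.toString();
    }

    /**
     * Optional companion method with an equivalent signature. It reports the
     * return type without running execute. For a function that always returns
     * text it is not required; it is included here only to show its shape.
     */
    public Class<String> executeReturnType(String... inputs) {
        return String.class;
    }

    public static void main(String[] args) {
        Concat_SampleItpFunction function = new Concat_SampleItpFunction();
        System.out.println(function.execute("IT", "Pilot"));
    }
}
```

Because the class depends only on the JDK, it can be unit tested without an ITPilot installation, which is the main attraction of the convention-based style for simple functions.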

4.2 COMPOUND TYPES

Compound types and values in custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on input values, and functions returning compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).

2. The execution method must have the same number of parameters as the additional method.

3. Each parameter of the additional method must have the same or an equivalent type as its respective parameter in the execute method.

If the execute method returns a basic Java type, the additional method has to return the corresponding Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 for the type that these return values will have in ITPilot.


4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        // Function signature: SPLIT_SAMPLE(regexp, value)
        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name="regexp") String regex,
                @CustomParam(name="value") String value) {

            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results =
                    new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues =
                    new ArrayList<CustomRecordValue>(result.length);
            // One single-field register per substring
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue =
                        CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        // Computes the return type (array of registers with one text field)
        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props =
                    new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record =
                    CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array =
                    CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2: ITPilot Custom Function Sample


5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for programming, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with edition 3 of the ECMA-262 standard [ECMA262]. The following sections assume basic prior knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components (section 5.3) and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", "", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3 ITPilot Wrapper Skeleton in JavaScript

Each script can contain three functions, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist¹.

These functions mirror the definition of the process as components: the input parameters are those defined in the Initialization component, and the output record is defined exactly as it is received by the Output component.

¹ Since version 4.0SP1 this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.


5.2.1 Initialization of Searchable Parameters

The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the component catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.

5.2.3 Generating the Output Structure

The getOutputSchema() function determines the wrapper's output structure, if there is one. The structure is a data record implemented by the Record_Structure object, which is described in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another argument to its right has to be given. When a parameter is optional, its default value is indicated in the function description. For example, for the RECORD_STRUCTURE object (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). The call rs.setText("FIELD", OBLIGATORY) is not valid; the following must be used instead: rs.setText("FIELD", "", OBLIGATORY).
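The omission rule in the note can be illustrated with a small stand-in. The sketch below is not the ITPilot implementation: RecordStructureSketch and the string values given to OPTIONAL and OBLIGATORY are assumptions introduced only so that the example runs outside ITPilot. It shows why an argument in the middle cannot be skipped: arguments are matched by position.

```javascript
// Hypothetical stand-ins so the sketch runs outside ITPilot.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Minimal stand-in for a record structure with the documented method shape
// setText(field, regexp, type), where regexp and type may be omitted.
function RecordStructureSketch(name) {
    this.name = name;
    this.fields = [];
}

RecordStructureSketch.prototype.setText = function (field, regexp, type) {
    // Arguments are positional, so a missing trailing argument is undefined.
    if (regexp === undefined) { regexp = ""; }
    if (type === undefined) { type = OPTIONAL; }
    this.fields.push({ field: field, regexp: regexp, type: type });
};

var rs = new RecordStructureSketch("OUT_REC");
rs.setText("A");                  // regexp "" and type OPTIONAL by default
rs.setText("B", "", OBLIGATORY);  // regexp must be given in order to reach type
rs.setText("C", OBLIGATORY);      // here OBLIGATORY lands in the regexp slot
```

In this stand-in the third call does not fail, but the value is read as the regular expression and the field stays optional, which is the reason the two-argument form cannot serve as a shorthand.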

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections define them.

5.3.2.1 Record Structure

• Object: Record_Structure
• Description: represents a data structure that allows the structure of a specific record to be defined. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure
  o setText(field, regexp, type): creates a new character string field in the record


    • field: name of the new field
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setLink(field, type): creates a new Link-type field in the record
    • field: name of the new field
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record
    • field: name of the new field
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record
    • field: name of the new field
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setLong(field, type): creates a new Long-type field in the record
    • field: name of the new field
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record
    • field: name of the new field
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record
    • field: name of the new field
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record
    • field: name of the new field
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.


  o setDate(field, regexp, format, type): creates a new Date-type field in the record
    • field: name of the new field
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".
    • format (optional): date format following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record
    • record: record name
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record
    • name: name of the array
    • structure: data structure that represents the structure of the records contained in the array
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.
  o toString(): transforms the record into a character string for its representation

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except for Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List
• Functions:
  o setListName(listName): sets the name of the list
    • listName: name of the list
  o add(obj): adds an element to the list
    • obj: element to add
  o toArray(): transforms the list into a JavaScript object array


5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog points out the components that lack any of these "common" functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave when an error of a given type occurs. The onError function can be invoked several times with different errorId values.
  o errorId: the type of error whose handling is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the execution environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the properties IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return "null" when an error occurs, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", the result is evaluated as "false" if it is used in a condition expression.
    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured on each component.
    • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error persists after the retries.
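The four error actions can be compared with a small simulation. This is only a sketch under stated assumptions: runWithPolicy is a hypothetical helper, the constants are given string values so the code runs outside ITPilot, a single task stands in for whatever ITPilot reruns, the delay between retries is left out, and raising the error once the ON_ERROR_RETRY retries are exhausted is an assumption inferred from the contrast with ON_ERROR_RETRY_IGNORE.

```javascript
// Hypothetical stand-ins for the documented error actions.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Runs task() under the given error action. The delay between retries is
// omitted to keep the sketch synchronous.
function runWithPolicy(task, errorAction, retries) {
    var retrying = errorAction === ON_ERROR_RETRY ||
                   errorAction === ON_ERROR_RETRY_IGNORE;
    var attempts = retrying ? retries + 1 : 1;
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return task();
        } catch (e) {
            lastError = e;
        }
    }
    if (errorAction === ON_ERROR_IGNORE ||
        errorAction === ON_ERROR_RETRY_IGNORE) {
        return null;   // the run continues; the component yields null
    }
    throw lastError;   // ON_ERROR_RAISE, or ON_ERROR_RETRY with retries exhausted
}

// A task that fails twice and then succeeds.
var calls = 0;
function flaky() {
    calls++;
    if (calls < 3) { throw new Error("TIMEOUT_ERROR"); }
    return "page";
}

var retried = runWithPolicy(flaky, ON_ERROR_RETRY, 3);
var ignored = runWithPolicy(function () { throw new Error("HTTP_ERROR"); },
                            ON_ERROR_IGNORE, 0);
```

In a real wrapper the policy is attached per error type with onError(errorId, errorAction), so one component can, for example, retry on TIMEOUT_ERROR while raising on SEQUENCE_ERROR.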


5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level used when this component runs. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log and 5 means that all message types are written to the log file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL


5.3.4 Add Record To List

• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function
    • record: record to be added to the list
    • list: list to which the record is added


5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression, expressed as a character string. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition and returns "true" or "false", depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: the elements on which the condition is evaluated, given in the format [ELEMENT1, ELEMENT2, …, ELEMENTN]
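The positional placeholders $0, $1, … can be pictured with a short stand-in. ConditionSketch is hypothetical and is not the ITPilot Condition object: real condition expressions use the function set described in [GENER], whereas here the expression is evaluated as plain JavaScript purely to show how each $n is bound to the n-th element passed to exec.

```javascript
// Hypothetical stand-in for the Condition component. The expression is
// treated as JavaScript for illustration only.
function ConditionSketch(expr) {
    // Rewrite each $n placeholder as a lookup into the elements array,
    // e.g. "($0 <= $1)" becomes "(elements[0] <= elements[1])".
    var body = expr.replace(/\$(\d+)/g, "elements[$1]");
    this.test = new Function("elements", "return (" + body + ");");
}

ConditionSketch.prototype.exec = function (elements) {
    return this.test(elements) ? true : false;
};

var myCondition = new ConditionSketch("($0 <= $1)");
var first = myCondition.exec([3, 7]);   // compares 3 <= 7
var second = myCondition.exec([9, 7]);  // compares 9 <= 7
```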


5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list
    • listname: name of the list of records to be created
  o exec(): runs the component


5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler
  o exec(): executes the component


5.3.8 Diff

• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag)
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag)
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag)
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag)
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different
    • baseCode: character string with the source page content
    • finalCode: character string or page object with the target page content
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages and points out the differences between them
    • baseCode: character string with the source page content
    • finalCode: character string or page object with the target page content
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added content
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag)
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added content
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag)
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted content
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag)


  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted content
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag)
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself
    • nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned
  o setIgnoreTagAttributes(simplifyTags): the component does not take the HTML tag attributes into account when comparing both pages
    • simplifyTags: "true" means that the HTML tag attributes are ignored; with "false" they are not ignored
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not
    • mergedDeletions: with "true" the deleted content is shown. If the value is "false", the configuration set with setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page; those that match the regular expression are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression
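The role of the prefix and suffix labels can be sketched with a token-level comparison. The code below is a stand-in, not the algorithm ITPilot uses: diffTokens is a hypothetical function, the tokenizer (splitting on whitespace) and the bracket labels are assumptions, and the comparison is a plain longest-common-subsequence walk.

```javascript
// Hypothetical stand-in for a token-level page comparison.
function diffTokens(baseCode, finalCode, labels) {
    var a = baseCode.split(/\s+/);
    var b = finalCode.split(/\s+/);
    var i, j;

    // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
    var lcs = [];
    for (i = 0; i <= a.length; i++) {
        lcs[i] = [];
        for (j = 0; j <= b.length; j++) { lcs[i][j] = 0; }
    }
    for (i = a.length - 1; i >= 0; i--) {
        for (j = b.length - 1; j >= 0; j--) {
            lcs[i][j] = a[i] === b[j]
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }

    // Walk both token lists, wrapping deletions and additions in their labels.
    var out = [];
    i = 0;
    j = 0;
    while (i < a.length || j < b.length) {
        if (i < a.length && j < b.length && a[i] === b[j]) {
            out.push(a[i]);
            i++;
            j++;
        } else if (i < a.length && (j === b.length || lcs[i + 1][j] >= lcs[i][j + 1])) {
            out.push(labels.delPrefix + a[i] + labels.delSuffix);
            i++;
        } else {
            out.push(labels.addPrefix + b[j] + labels.addSuffix);
            j++;
        }
    }
    return out.join(" ");
}

var labels = { addPrefix: "[+", addSuffix: "+]", delPrefix: "[-", delSuffix: "-]" };
var merged = diffTokens("<p> old text </p>", "<p> new text </p>", labels);
```

With HTML labels in place of the brackets, the same walk would produce a page in which added and removed tokens are highlighted, which is the kind of result exec is described as returning.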


5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4 Using the ExecuteJS NSEQL command


5.3.10 Expression

• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression, expressed as a character string. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression


5.3.11 Extractor

• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance
    • page: page-type ITPilot structure from which data is to be extracted
    • specification: DEXTL data extraction specification (see [DEXTL])
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information)
    • merge: Boolean parameter; "true" if the pattern merge technique is to be applied, "false" if not. It is "true" by default.
  o setI18n(i18n): updates the internationalization used by the process
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.


5.3.12 Fetch

• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (optional)
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format
    • page: optionally, the page from which the HTTP request is launched
  o exec(page): runs the component, returning the string- or binary-type value obtained
    • page: optionally, the page from which the HTTP request is launched
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send
    • encoding: MIME type of the information to send
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit navigation sequence back to the source page, against which further data extraction operations are going to be executed
    • back: back sequence NSEQL program
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not


    • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when no back sequence has been defined and POST navigation has not been configured
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation


5.3.13 Filter

• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for the list of records passed to the exec function
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor
    • inputRecords: list of input records
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
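The behavior described in the note can be pictured as follows. This is a sketch under assumptions, not the Filter component itself: filterRecords is a hypothetical function, a JavaScript predicate plays the role of the filter expression, and the ignoreErrors flag mimics a handler set to ON_ERROR_IGNORE.

```javascript
// Hypothetical stand-in for Filter with an ignore-errors switch.
function filterRecords(inputRecords, predicate, ignoreErrors) {
    var kept = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) { kept.push(inputRecords[i]); }
        } catch (e) {
            if (!ignoreErrors) { throw e; }
            // With errors ignored, only the offending record is dropped.
        }
    }
    return kept;
}

var records = [{ price: 10 }, { price: null }, { price: 30 }];

// Fails on records without a price, as a stand-in for a runtime error.
function atLeast10(record) {
    if (record.price === null) { throw new Error("RUNTIME_ERROR"); }
    return record.price >= 10;
}

var result = filterRecords(records, atLeast10, true);
```

Here result keeps the first and third records: the record that raised the error is skipped, and the rest of the list is still filtered.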


5.3.14 Form Iterator

• Object: Form_Iterator
• Description: generates an execution loop over a specific form, using predetermined values for each of its fields in each iteration.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL)
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL)
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component
    • inputPage: input page from which the selected form is invoked iteratively
    • parallelIterator: if "true", the component executes its iterations in parallel
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form
    • field: name of the multiple selection field
    • position: position of the field among those with the same name, starting at position 0
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element
      • NON_CLICKED_ELEMENT: leave the element unmarked


      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form
    • field: name of the multiple selection field
    • position: position of the field among those with the same name, starting at position 0
    • valuesArray: list of values that must be selected in the field
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained in it (equals = false)
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element
      • NON_CLICKED_ELEMENT: leave the element unmarked
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form
    • field: name of the HTML selection field
    • position: position of the field when more than one field has the same name
    • positions: values of the elements on which the component must iterate
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field
    • field: name of the HTML text field
    • position: position of the field when several fields on the form have the same name
    • values: list of values that must be selected in the field
    • positions: list that indicates the position held by each values element in the event of replicated values
    • equal: boolean value that indicates whether the field values must exactly match those provided to the function or may merely be contained in them


  o click(field, value, state): selects an element and runs a "click" event on it
    • field: name of the HTML field on which the click is to be made
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element
      • NON_CLICKED_ELEMENT: leave the element unmarked
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked
  o input(field, position, values): indicates the values added to an input field
    • field: name of the HTML input field
    • position: position of the field when several fields on the form have the same name
    • values: list of values that must be entered in the field
  o textarea(field, position, values): indicates the values added to a text area
    • field: name of the HTML text area field
    • position: position of the field when several fields on the form have the same name
    • values: list of values that must be entered in the field
  o toList(): returns the list with the NSEQL sequences used in each iteration
  o setMaxIterations(count): sets the maximum number of iterations that can be executed
    • count: maximum number of iterations
  o setRetries(count): sets the number of retries in the event of failures
    • count: number of retries
  o setRetryDelay(mseconds): sets the waiting time between retries
    • mseconds: waiting time between retries, in milliseconds
  o setParallelIterator(flag): makes the component launch its iterations in parallel
    • flag: if "true", the iterations are executed in parallel
  o next(inputPage): returns the page resulting from running a component iteration
    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run


  o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it
    • back: NSEQL back program
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, for later browsing by using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored “cookies” information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created as the group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. Specifying a format is optional, but once it is specified it becomes compulsory; otherwise, the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different types of i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
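The effect of the three values of the obl parameter can be pictured with the following minimal sketch. It does not use the ITPilot Init object: InitSketch, its resolve(query) method and the string values given to OPTIONAL, OBLIGATORY and FIXED are hypothetical stand-ins. The behavior shown (an error for a missing obligatory parameter, a null value for a missing optional one) is an assumption drawn from the descriptions above, not a statement of how ITPilot is implemented.

```javascript
// Hypothetical stand-ins for the obl constants; the real values are not
// given in this guide.
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

// InitSketch is NOT the ITPilot Init component; it only mimics the three
// obl modes so that their effect on a query can be observed.
function InitSketch() {
    this.fields = [];
}

InitSketch.prototype.setText = function (field, obl, fixedValue) {
    // OPTIONAL is the default value of obl, as stated in the list above.
    this.fields.push({ name: field, obl: obl || OPTIONAL, fixedValue: fixedValue });
};

// resolve(query) is a hypothetical helper: it builds the initialization
// record from the parameters supplied by the calling application.
InitSketch.prototype.resolve = function (query) {
    var record = {};
    this.fields.forEach(function (f) {
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue;          // constant value
        } else if (Object.prototype.hasOwnProperty.call(query, f.name)) {
            record[f.name] = query[f.name];         // value given in the query
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null;                  // optional and not supplied
        }
    });
    return record;
};

var init = new InitSketch();
init.setText("TITLE", OBLIGATORY);
init.setText("AUTHOR");                  // OPTIONAL by default
init.setText("LANGUAGE", FIXED, "en");
var record = init.resolve({ TITLE: "Dune" });
```

In this sketch a query that omits TITLE is rejected, while AUTHOR may be left out and LANGUAGE always takes its fixed value.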

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphic interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
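The hasNext()/next() protocol described above can be tried outside ITPilot with a small stand-in. In the sketch below, ListIteratorSketch is a hypothetical object written only for illustration (it is not the ITPilot Iterator component), and records are represented as plain JavaScript objects, which is also an assumption.

```javascript
// Hypothetical stand-in for the Iterator component: it walks an array of
// records in order, exposing the same two functions documented above.
function ListIteratorSketch(list) {
    this.list = list;
    this.position = 0;
}

// True while at least one more record is left.
ListIteratorSketch.prototype.hasNext = function () {
    return this.position < this.list.length;
};

// Returns the next record and advances the cursor.
ListIteratorSketch.prototype.next = function () {
    return this.list[this.position++];
};

var iterator = new ListIteratorSketch([{ TITLE: "first" }, { TITLE: "second" }]);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);   // sequential processing of each record
}
```

The loop has the same shape as the one in Figure 5; the only difference is that each record is processed in place instead of being handed to a Thread object.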

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: iterates over several inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information.
    • inputPage: page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the given component input record to the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements, except for the one that caused the error.

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created to access other pages from the pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; indicates the starting page.
  o exec(): returns a Page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier, and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
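The practical difference between the while shape of Figure 6 and the do… while shape of Figure 7 is the moment at which the condition is evaluated. The sketch below illustrates this with plain JavaScript. StubCondition is a hypothetical replacement for the Condition object (it simply returns pre-scripted answers from exec()), so the example says nothing about how ITPilot evaluates real condition expressions.

```javascript
// Hypothetical stub: exec() hands back the scripted answers one by one,
// and false once they run out. It is NOT the ITPilot Condition object.
function StubCondition(answers) {
    this.answers = answers.slice();
}

StubCondition.prototype.exec = function () {
    return this.answers.length > 0 ? this.answers.shift() : false;
};

// Loop shape (Figure 6): the condition is checked before every pass,
// so a condition that is false from the start means zero passes.
var loopPasses = 0;
var loop = new StubCondition([false]);
while (loop.exec([])) {
    loopPasses++;
}

// Repeat shape (Figure 7): the body runs first and the condition is
// checked afterwards, so the body always runs at least once.
var repeatPasses = 0;
var repeat = new StubCondition([false]);
do {
    repeatPasses++;
} while (repeat.exec([]));
```

With the same always-false condition, the Loop shape performs no pass and the Repeat shape performs exactly one.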

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written directly in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphic generation interface, it becomes a JavaScript function that is invoked from the place the component occupies within the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.

5.3.28 Store File

• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content will be stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all the executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests will be queued until the ongoing executions finish.
    • int: maximum number.
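The queueing rule described for setMaxConcurrentThreads can be illustrated with a small, single-threaded simulation. BoundedRunnerSketch and its finish() function are hypothetical names introduced only for this example; the sketch is not the ITPilot Thread object, and the first-in, first-out order in which queued requests are started is an assumption.

```javascript
// Hypothetical simulation of the rule "requests beyond the maximum are
// queued until the ongoing executions finish". No real threads are used.
function BoundedRunnerSketch() {
    this.max = Infinity;   // no limit until setMaxConcurrentThreads is called
    this.running = [];     // names of the executions in progress
    this.queued = [];      // names of the executions waiting for a free slot
}

BoundedRunnerSketch.prototype.setMaxConcurrentThreads = function (max) {
    this.max = max;
};

// Starts the execution if a slot is free; otherwise queues it.
BoundedRunnerSketch.prototype.execute = function (name) {
    if (this.running.length < this.max) {
        this.running.push(name);
    } else {
        this.queued.push(name);
    }
};

// Simulates one ongoing execution finishing: its slot is freed and the
// oldest queued request (FIFO order is an assumption) takes it.
BoundedRunnerSketch.prototype.finish = function (name) {
    var index = this.running.indexOf(name);
    if (index === -1) {
        return;
    }
    this.running.splice(index, 1);
    if (this.queued.length > 0) {
        this.running.push(this.queued.shift());
    }
};

var runner = new BoundedRunnerSketch();
runner.setMaxConcurrentThreads(2);
runner.execute("record1");
runner.execute("record2");
runner.execute("record3");                 // both slots busy: queued
var queuedBefore = runner.queued.slice();
runner.finish("record1");                  // record3 takes the freed slot
```

With a maximum of two, the third request waits in the queue and only starts once one of the first two executions has finished.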

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically by using the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
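As an illustration of how the four functions fit together, the following sketch outlines a component named mycustom that returns a record. Several parts are assumptions made for the example: this section does not show how input and output structures are represented, so plain objects are used as placeholders; the field names TITLE and TITLE_UPPER are invented; and the two type constants are redeclared with the values listed above only so that the file runs on its own.

```javascript
// Output-type constants with the values listed in this section. They are
// redeclared here only to keep the sketch self-contained.
var RECORD_TYPE = 3;
var STRING_TYPE = 13;

// Placeholder input schema (the real structure format is not shown in
// this section, so a plain object stands in for it).
function mycustom_getInputStructure() {
    return { TITLE: STRING_TYPE };
}

// The component returns a record, so the output structure function
// below is required as well.
function mycustom_getOutputType() {
    return RECORD_TYPE;
}

// Placeholder output schema, with one field more than the input.
function mycustom_getOutputStructure() {
    return { TITLE: STRING_TYPE, TITLE_UPPER: STRING_TYPE };
}

// Main function: receives the input record and returns the output record.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    mycustom_output = {
        TITLE: mycustom_input.TITLE,
        TITLE_UPPER: String(mycustom_input.TITLE).toUpperCase()
    };
    return mycustom_output;
}

var sampleOutput = mycustom_main({ TITLE: "itpilot" });
```

All function names share the mycustom prefix, which matches the name of the component, as the section requires.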

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by one .js file, the name of which matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘\’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl


5.3.12 Fetch
5.3.13 Filter
5.3.14 Form Iterator
5.3.15 Get Page
5.3.16 Init
5.3.17 Iterator
5.3.18 JDBCExtractor
5.3.19 Loop
5.3.20 Next Interval Iterator
5.3.21 Output
5.3.22 Record Constructor
5.3.23 Record Sequence or Extractor Sequence
5.3.24 Release Persistent Browser
5.3.25 Repeat
5.3.26 Script
5.3.27 Sequence
5.3.28 Store File
5.3.29 Thread

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
5.4.1 Developing Custom Components
5.4.2 Using Custom Components

5.5 WRAPPER DEVELOPMENT

REFERENCES

FIGURES

Figure 1 Example of query execution to a wrapper
Figure 2 ITPilot Custom Function Sample
Figure 3 ITPilot Wrapper Skeleton in JavaScript
Figure 4 Using the ExecuteJS NSEQL command
Figure 5 Using threads in the Iterator component
Figure 6 Using the Loop function
Figure 7 Using the Repeat function
Figure 8 Using custom components from JavaScript


PREFACE

SCOPE

Denodo ITPilot enables easy access to, and extraction of, data from semi-structured Web data sources. This document is an introduction to application development using wrappers created by Denodo ITPilot.

WHO SHOULD USE THIS DOCUMENT

This document is aimed at developers who want to learn how to build applications that make the best use of the advanced automation and Web data extraction functionality provided by Denodo ITPilot. Detailed information on installing and managing the system is provided in other manuals, which are referenced as the need arises.

SUMMARY OF CONTENTS

More specifically, this document:

• Presents the fundamental steps needed to develop an application that uses the wrappers generated by Denodo ITPilot.

• Describes the task of exporting and deploying a wrapper as a Web Service.

• Gives a detailed description of how to use the development API offered by Denodo ITPilot.

• Provides an example of how to develop an application that uses a wrapper installed in a Denodo ITPilot execution server.

• Details how to create custom ITPilot functions.

• Explains how to develop wrappers by using the ITPilot JavaScript components.


1 INTRODUCTION

Denodo ITPilot is a Denodo Technologies solution for extracting and structuring the data present in Web sources. This is done by constructing an abstraction of the target Web source, called a “wrapper”, that frees client applications from the difficulties of accessing and extracting the required data. ITPilot provides a distributed and scalable environment for generating, executing and maintaining wrappers. See [USER] and [GENER] for more information on how to create, install and maintain wrappers using Denodo ITPilot.

This manual describes the Java development API for creating clients that use wrappers that have already been generated and installed. It gives the basic guidelines for using the API, describes the main components and provides some examples of use. See the Javadoc documentation [JDOC] for more details on classes, attributes and operations.

This manual also explains how to access wrappers through Web Services exported in the execution environment.


2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES

The wrappers saved in the execution server can be invoked in two different ways. Firstly, the native ITPilot Java API can be used to access the wrappers, obtain their data structure and run queries on them from a Java application; this API is described in section 3. The other option is to expose the wrappers through Web Services, which is the option described in this section.

A Web Service containing the following operations can be generated for a particular wrapper:

• An operation containing all searchable and compulsory parameters.

• Optionally, another operation with all searchable and compulsory parameters plus any searchable and optional parameters selected in the Web Service generation process (this process is described in [USER]).

The ITPilot execution server generates a Web Service as a .war file that can be deployed in any J2EE application server.

2.1 WEB SERVICE TYPES

ITPilot allows a wrapper to be published as a Web Service so that it can be used by any external application. The ITPilot execution server generates a Web Service as a .war file that can be deployed in any J2EE application server. The types of Web services that ITPilot can publish are:

• SOAP [SOAP] Web Services.

• REST-style Web Services that use HTTP directly as the transport protocol and return data encoded in XML.

• HTML Web Services. These are similar to the REST-style Web services, but the output consists of an HTML table containing the response data for the query executed. The table includes JavaScript code to sort the results by any field and/or paginate the returned results. It is also possible to adjust the size of the table and the cells, and to modify its graphic appearance using a CSS file.

The following sections describe the querying process for these Web Services.

2.2 INVOKING SOAP WEB SERVICES

The SOAP version of the published Web Services can be accessed by using any Web Service client or client generator that meets the SOAP 1.2 [SOAP] and WSDL 1.1 [WSDL] standards, such as the Apache Axis wsdl2java [AXIS] or .NET Framework wsdl [DOTNET] tools.

The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL to the Web Service WSDL:

http://<domain>:<port>/<service_name>/services/<service_name>?wsdl

The ITPilot distribution contains a sample client generated using Apache Axis in the samples/itpilot/itp-clients directory. The README file in this path contains detailed information on how to generate, compile and run the files that make up the client application.

2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES

This section describes how to invoke the REST and HTML versions of the Web Services that have been published by DataPort, once they have been deployed in the Web Service container. Once the .war file has been deployed in the J2EE application server, the relative paths /rest and /html of the webapp each show an information screen for the respective Web service version, listing the available operations.


Example: if the Web service container is running on port 9090 of the acme host and the name chosen for the exported Web service was testWS, the access URLs for the information page in the REST (XML output) and HTML versions would be:

http://acme:9090/testWS/rest
http://acme:9090/testWS/html

For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the xsd file that describes the schema of the XML document returned by a call to each operation. To access the XML Schema of the data returned by invoking an operation of the REST version of the Web service, the following URL format should be used:

httphostportserviceNamerestopNamexsd

Example: again, if the Web service container runs on port 9090 of the acme host and the name chosen for the exported Web service was testWS, the following URL obtains the XML Schema of the data returned by the operation getPRODUCTDATA:

httpacme9090testWSrestgetPRODUCTDATAxsd

The format used to invoke a specific operation in the REST version is the following:

http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen

where n is the number of parameters of the operation. The format for the HTML version is the same, replacing ‘rest’ with ‘html’.

Example: the Web service container runs on port 9090 of the acme host and the name chosen for the exported Web service was testWS. Suppose also that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML Web service versions, respectively:

http://acme:9090/testWS/rest/getPRODUCTDATA
http://acme:9090/testWS/html/getPRODUCTDATA

If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results for a parameter value of 1 would be obtained with:

http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
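The invocation format above can be assembled programmatically. The following is a minimal sketch using only the JDK; the class WrapperServiceUrls and its method are hypothetical helpers, not part of ITPilot. It assumes that the operation URL follows ordinary URL syntax (host and port separated by a colon, path segments by slashes, and query parameters introduced by ‘?’ and joined by ‘&’) and that parameter values should be URL-encoded as UTF-8.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of ITPilot) that assembles REST/HTML operation URLs. */
final class WrapperServiceUrls {

    private WrapperServiceUrls() { }

    /**
     * Builds a URL for one operation of an exported Web service.
     * The version argument is "rest" or "html"; the URL layout is an assumption.
     */
    static String operationUrl(String host, int port, String serviceName,
                               String version, String opName,
                               Map<String, String> params) {
        StringBuilder url = new StringBuilder("http://")
                .append(host).append(':').append(port)
                .append('/').append(serviceName)
                .append('/').append(version)
                .append('/').append(opName);
        char separator = '?';                       // first parameter starts the query string
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(separator)
               .append(encode(p.getKey())).append('=').append(encode(p.getValue()));
            separator = '&';                        // later parameters are joined with '&'
        }
        return url.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<String, String>();
        params.put("prod_id", "1");
        System.out.println(operationUrl("acme", 9090, "testWS", "rest",
                "getPRODUCTDATABYPRODID", params));
    }
}
```

A LinkedHashMap is used so that parameters appear in the order in which they were added, which keeps the generated URL predictable.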

2.3.1 HTML Output Configuration

The HTML version of the published Web Services may be invoked with certain additional parameters to configure the HTML table used to display the results of the queries. The additional parameters are as follows:

• shownumresults: if this parameter is set to true, the table displays the number of results obtained by the wrapper.


• intervalsize: if this parameter is specified, the results obtained by the wrapper are displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.

• maxresults: the maximum number of results to be displayed. If the wrapper run returns more results than indicated, all excess results are discarded.

• cellwidth: the maximum cell width, expressed in number of characters. The width of each cell in the table adapts to the text, unless the size indicated in this parameter is exceeded. In that case, carriage returns are added to divide the text into lines.

• cellheight: the maximum number of lines in a cell after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of the column are given a scroll bar.

• width: the maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.

• height: the maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.

These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:

httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen

For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a pagination interval size of 10. Once again, it is assumed that the Web service container runs on port 9090 of the acme machine:

httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10

2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES

Once the Web Service operations have been exported, some parameters can be used to configure the connection pool that the Web Services use to connect to the ITPilot server. The web.xml file, found in the WEB-INF path of the exported Web service (either inside the .war file generated by ITPilot or in the directory where the Web Service has been deployed), has three parameters for configuring the connection pool:

1. poolEnabled: enables or disables the connection pool. The possible values are “true” and “false”.

<env-entry>
    <env-entry-name>poolEnabled</env-entry-name>
    <env-entry-value>false</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

2. poolInitSize: defines the initial size of the connection pool.

<env-entry>
    <env-entry-name>poolInitSize</env-entry-name>
    <env-entry-value>0</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>


3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this value, new requests are queued until a connection becomes free.

<env-entry>
    <env-entry-name>poolMaxActive</env-entry-name>
    <env-entry-value>30</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
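To check which pool settings a deployed Web service is using, the env-entry elements can be read back from web.xml. The sketch below is one possible way to do that with the JDK's DOM parser; the class PoolSettingsReader is a hypothetical helper, not part of ITPilot. It assumes the file content is available as a string and contains no DOCTYPE that would require fetching an external DTD.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical helper (not part of ITPilot) that lists the env-entry values of a web.xml. */
final class PoolSettingsReader {

    private PoolSettingsReader() { }

    /** Returns a map from env-entry-name to env-entry-value, in document order. */
    static Map<String, String> readEnvEntries(String webXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(webXml.getBytes(StandardCharsets.UTF_8)));
        Map<String, String> entries = new LinkedHashMap<String, String>();
        NodeList nodes = doc.getElementsByTagName("env-entry");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element entry = (Element) nodes.item(i);
            entries.put(childText(entry, "env-entry-name"),
                        childText(entry, "env-entry-value"));
        }
        return entries;
    }

    /** Text of the first descendant with the given tag, or an empty string if absent. */
    private static String childText(Element parent, String tag) {
        Node node = parent.getElementsByTagName(tag).item(0);
        return node == null ? "" : node.getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<web-app><env-entry>"
                + "<env-entry-name>poolMaxActive</env-entry-name>"
                + "<env-entry-value>30</env-entry-value>"
                + "<env-entry-type>java.lang.String</env-entry-type>"
                + "</env-entry></web-app>";
        System.out.println(readEnvEntries(xml));
    }
}
```

All three values are declared as java.lang.String in web.xml, so the caller is responsible for converting poolInitSize and poolMaxActive to numbers if needed.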


3 ITPILOT DEVELOPMENT API

Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Among other functions, this API makes it possible to connect to a Denodo ITPilot execution server, obtain a reference to a wrapper installed in that server and query it. It also supports a series of additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.

The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance makes it possible to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance may be used to carry out various tasks on the wrapper, the most important of which is query execution.

When a query is invoked on the wrapper, the results are returned to the application asynchronously: the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received.

The following subsections deal in more detail with each of the stages mentioned: connecting to the server, obtaining references to wrappers, executing actions on them and processing query results. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].

3.1 CONNECTING TO THE SERVER

There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.

If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server runs.

If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, the connection can specify any database created in the Virtual DataPort server and use any user defined in it. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).

In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server runs, this constructor takes the name of the Virtual DataPort database to connect to, the user ID with which access is to be made and the associated password.

Note that even if Virtual DataPort is installed, it is still possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called ‘itpilot’ is accessed, using the predefined user ‘admin’ (with the initial password ‘admin’).


3.2 OBTAINING WRAPPERS

As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class provides methods for obtaining data on the execution server and accessing the wrappers present in it:

• Collection getHTMLWrapperNames(). Obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database and only the wrappers associated with that database are returned.

• HTMLWrapperProxy getHTMLWrapper(String wpName). Obtains a reference to the wrapper whose name is specified as a parameter.

• Collection getDatabaseNames(). This method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.

• void deleteWrapper(String wpName). Deletes from the server the wrapper whose name is specified as a parameter.

• void loadWrapper(String vql). Takes as input the VQL that defines a collection of wrappers, which are loaded in the execution server.

• String getVQL(). Returns the VQL description of all wrappers in the ITPilot execution server.

3.3 USING WRAPPERS

Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:

HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified when the wrapper was created. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper expects a float-type parameter and the value 325 is to be assigned when invoking it, the string “325” must be passed.

For the float, double and date data types, the values must be provided according to the internationalization configuration specified in the wrapper's Init component or, in the case of date data types, the date pattern if one was set.

For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.

Although most applications will not require it, a wrapper schema can be obtained using the method:

HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).

The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, boolean or blob. The value of a compound attribute is always an array of registers. In turn, each register is composed of several fields, and these fields may again be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various available editions of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a “hierarchical” register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the various atomic fields that make up the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field:

• Its type (method java.lang.Class getType()).

• Whether the value is obtained from the source or not, that is, whether it is a searchable field that does not appear in the output schema (method boolean isSearchStatus()) and, in that case, whether it is mandatory or not (method boolean isMandatoryStatus()).

• If they were defined during the generation process, the regular expression (method java.lang.String getRegexp()) and the aliases defined for the field (method java.util.List getTextValues()).

Finally, the methods

void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting, through the API, whether or not a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If autodeploy is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the DENODO_HOME/metadata/maintenance-backup path (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper is regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
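Returning to the query parameters described earlier in this section: because every value is passed as a string, numbers and dates have to be rendered in the format the wrapper expects. The sketch below shows one way to do this with the JDK's java.text classes. It is an illustration under assumptions, not the ITPilot API: the class QueryParams, the parameter names MAX_PRICE and RELEASED_AFTER, the German locale and the dd/MM/yyyy pattern are all hypothetical, and the format actually required depends on the wrapper's Init configuration.

```java
import java.text.NumberFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TimeZone;

/** Hypothetical helper (not part of the ITPilot API) that renders query values as strings. */
final class QueryParams {

    private QueryParams() { }

    /** Formats a number with the decimal separator of the given locale, without grouping. */
    static String formatDouble(double value, Locale locale) {
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        format.setGroupingUsed(false);
        format.setMaximumFractionDigits(10);
        return format.format(value);
    }

    /** Formats a date with an explicit pattern, locale and time zone. */
    static String formatDate(Date value, String pattern, Locale locale, TimeZone zone) {
        SimpleDateFormat format = new SimpleDateFormat(pattern, locale);
        format.setTimeZone(zone);
        return format.format(value);
    }

    public static void main(String[] args) {
        // Assumed wrapper configuration: German number format, dd/MM/yyyy date pattern.
        Map<String, String> params = new HashMap<String, String>();
        params.put("DIRECTOR", "Woody Allen");
        params.put("MAX_PRICE", formatDouble(3.25, Locale.GERMANY));   // hypothetical parameter
        params.put("RELEASED_AFTER", formatDate(new Date(0L), "dd/MM/yyyy",
                Locale.GERMANY, TimeZone.getTimeZone("UTC")));          // hypothetical parameter
        System.out.println(params);
    }
}
```

A map built this way would then be passed to the query method. Fixing the locale and time zone explicitly avoids depending on the defaults of the machine that runs the client.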

3.4 PROCESSING QUERY RESULTS

The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time, through the network).


The method hasNext() checks whether there are still elements to return. Because results arrive asynchronously, this method must be used before accessing each element, to make sure that data elements are available.

The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field. The method next() throws an exception of type NoSuchElementException if no data are available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation [JDOC] for details.

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:

• Boolean checkErrors(). Checks whether an error has occurred during query execution. Returns ‘true’ if an error has occurred and ‘false’ if not.

• String getErrorDescription(). If an error has occurred, returns a textual description of it; otherwise, returns null. The custom error messages specified by the wrapper creator for the raise error handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
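The hasNext()/next() discipline described above can be captured in a small helper. The sketch below works with any java.util.Iterator, so it runs without the ITPilot libraries; the class ResultDrain is hypothetical, and a plain list iterator stands in for HTMLWrapperResultIterator, which the text says implements the same interface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper (not part of the ITPilot API) that collects all results of an iterator. */
final class ResultDrain {

    private ResultDrain() { }

    /**
     * Reads every element, calling hasNext() before each next() so that
     * next() is never invoked when no element is available.
     */
    static <T> List<T> drain(Iterator<T> results) {
        List<T> rows = new ArrayList<T>();
        while (results.hasNext()) {
            rows.add(results.next());
        }
        return rows;
    }

    public static void main(String[] args) {
        // Stand-in for a result iterator: three placeholder rows.
        Iterator<String> standIn = Arrays.asList("row 1", "row 2", "row 3").iterator();
        System.out.println(drain(standIn));
    }
}
```

With the real iterator, the loop would be followed by a call to checkErrors() and, if it reports an error, getErrorDescription(). Collecting every row into a list gives up the benefit of processing results as they arrive, so for large result sets it may be preferable to process each row inside the loop instead.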


3.4.1 Canceling Queries

The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

void cancel()

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the ‘acme’ machine, on port 9999. Next, it obtains a reference to the wrapper called “Movies”, whose schema is the one used as an example in the preceding section:

TITLE
DIRECTOR
EDITIONS
    FORMAT
    PRICE
    DESCRIPTION

where TITLE and DIRECTOR are optional search fields.

Then a query is issued to the wrapper using the input parameter DIRECTOR with the value “Woody Allen”, and the results are processed and written to the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text, except the field PRICE, which is a double.

Finally, any errors produced during execution are checked.


package com.denodo.itpilot.client;

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
// DoubleVO is assumed to live in the same package as the other ValueVO subclasses
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;

public class ITPilotExample {

    public static void main(String args[]) {
        try {
            // Connect to server
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

            // Get wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare query params
            Map queryParams = new HashMap();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate results; hasNext() is called before every next()
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + ". ");

                // Get and print atomic fields TITLE, DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);

                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();

                    // Fields of a register are read with getValue(fieldName)
                    SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);

                    // PRICE is the only non-text field: it is a double
                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);

                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\t DESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server: " + e.getMessage());
        }
    }
}

Figure 1 Example of query execution to a wrapper


4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created using naming conventions, without dependencies on Denodo libraries, but we recommend developing them using Java annotations. The annotations, and the compound types and values required to create custom functions, are located in:

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading the jar to the server or removing it from the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, different types or both). For each signature of the function, this class must define a Java method implementing the functionality of the function with those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be basic types (int, long, double, etc.).


4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.

For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or that return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Its definition is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:
  o name: name of the custom function.
  o type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to specify the method as a function signature. This method will be executed when the function is used with the appropriate arguments. The annotation has an optional variable, syntax, which specifies the syntax of the function signature shown to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.

• com.denodo.common.custom.annotations.CustomParam: Parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
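
The naming conventions described at the beginning of this section can be sketched without any Denodo dependency. The following is a minimal, illustrative class that assumes only what the text states: the class name ends in ItpFunction, the method is named execute, and the last parameter is an array so that it can be repeated n times. The null handling and the ConcatSampleDemo driver class are assumptions added for this sketch; they are not part of ITPilot.

```java
// Sketch of a custom function defined through naming conventions only.
// The class name follows the <FunctionName> + "ItpFunction" pattern, so it
// would be interpreted as the function Concat_Sample.
// Assumption: when packaged in a Jar, the class would normally be declared
// public in its own source file.
class Concat_SampleItpFunction {

    // Signature with arity n: the last (and only) parameter is an array,
    // declared here with Java varargs.
    public String execute(String... inputs) {
        StringBuilder result = new StringBuilder();
        for (String input : inputs) {
            // Assumption for this sketch: null arguments are skipped.
            if (input != null) {
                result.append(input);
            }
        }
        return result.toString();
    }
}

// Hypothetical driver, present only so the sketch can be run on its own.
class ConcatSampleDemo {
    public static void main(String[] args) {
        Concat_SampleItpFunction function = new Concat_SampleItpFunction();
        System.out.println(function.execute("IT", "Pilot")); // prints "ITPilot"
    }
}
```

Because the method takes String objects rather than basic types, it respects the restriction on parameter types stated at the beginning of section 4.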

4.2 COMPOUND TYPES

Compound types and values in the custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either a java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on the input values, or functions returning compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when the execution of the function is slower or more memory intensive than the return type calculation. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained directly from the method).
2. The additional method must have the same number of parameters as the execute method.
3. Each parameter of the additional method must have the same type as its respective parameter in the execute method, or an equivalent one.

The value returned by the additional method depends on what the execute method returns:

• If the execute method returns a basic Java type, the additional method has to return the same basic Java class. I.e., if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table "Equivalency between Java and ITPilot data types" at the beginning of section 4 to know the type that these return values will have in ITPilot.
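
As an illustration of these rules, the sketch below pairs an execute method with its return-type method using the naming conventions of section 4.1, so it needs no Denodo libraries. Repeat_SampleItpFunction, its behavior and the RepeatSampleDemo driver are hypothetical; only the method names, the matching parameter lists and the rule that a String result corresponds to java.lang.String.class come from the text. Because the return type is constant here, the additional method is optional and is included only to show its shape.

```java
// Hypothetical custom function, named after the
// <FunctionName> + "ItpFunction" pattern (function Repeat_Sample).
class Repeat_SampleItpFunction {

    // Function signature. Both parameters use wrapper classes, since the
    // parameters of a custom function cannot be basic types such as int.
    public String execute(String value, Integer times) {
        if (value == null || times == null) {
            return null; // assumption for this sketch: null in, null out
        }
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < times; i++) {
            result.append(value);
        }
        return result.toString();
    }

    // Return-type method: same number and types of parameters as execute.
    // execute returns a String, so this method returns java.lang.String.class.
    public Class<String> executeReturnType(String value, Integer times) {
        return String.class;
    }
}

// Hypothetical driver, present only so the sketch can be run on its own.
class RepeatSampleDemo {
    public static void main(String[] args) {
        Repeat_SampleItpFunction function = new Repeat_SampleItpFunction();
        System.out.println(function.execute("ab", 3)); // prints "ababab"
        System.out.println(function.executeReturnType("ab", 3) == String.class); // prints "true"
    }
}
```

For a function returning a CustomRecordValue or CustomArrayValue, the return-type method would instead build and return a CustomRecordType or CustomArrayType, which requires the classes in denodo-custom.jar.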


4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT_SAMPLE, which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    @CustomExecutor()
    public CustomArrayValue split_sample(
            @CustomParam(name="regexp") String regex,
            @CustomParam(name="value") String value) {

        if (value == null || regex == null) {
            return null;
        }

        String[] result = value.split(regex);
        List<CustomRecordValue> arrayValues =
            new ArrayList<CustomRecordValue>(result.length);

        for (String string : result) {
            // One map per record, so that records never share state
            LinkedHashMap<String, Object> results =
                new LinkedHashMap<String, Object>(1);
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue =
                CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        // The array elements are records with a single text field
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}

Figure 2. ITPilot Custom Function Sample


5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the third edition of the ECMA-262 standard [ECMA262].

The following sections assume some basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

function getInit() {
    var start = new Init();
    start.setText("INITPARAM", "", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}

Figure 3. ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist (1).

These functions correspond to the definition of the process as components: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

(1) Since version 4.0 SP1 this function, previously known as getMetadata, has been renamed to getOutputSchema. The old name is still supported for backwards compatibility, but the use of the new name is strongly recommended.


5.2.1 Initialization of Searchable Parameters

This function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the Component Catalog).

5.2.2 Main Function

This is the place where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.

5.2.3 Generating the Output Structure

This function defines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is described in the catalog of section 5.3.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.

NOTE: Some of the parameters of the described functions can be omitted (by invoking the function with fewer input arguments). A parameter cannot be omitted if the value of another argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL).

rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).

5.3.2 Data Structures

ITPilot defines two data structures: List and Record (a data record defined by the Record Structure object). The following sections describe them.

5.3.2.1 Record Structure

• Object: Record_Structure

• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).

• Functions:

  o Constructor(name)
    • name: name of the structure.

  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setBoolean(field, type): creates a new boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".
    • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List

• Functions:

  o setListName(listName): sets the name of the list.
    • listName: name of the list.

  o add(obj): adds an element to the list.
    • obj: element to add.

  o toArray(): transforms the list into a JavaScript object array.


5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of the "common" functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId values.

  o errorId: indicates the type of error for which the behavior is defined. The possible values are:

    • RUNTIME_ERROR: error while the component is being run.

    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.

    • HTTP_ERROR: error produced by an HTTP error.

    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowser.MAX_DOWNLOAD_TIME.1 for Internet Explorer, IEBrowser.MAX_DOWNLOAD_TIME.2 for Firefox and IEBrowser.MAX_DOWNLOAD_TIME.3 for the http browser.

    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).

  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:

    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.

    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have any kind of return value will return "null" if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it will be evaluated as "false" if they are used in a condition expression.

    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each parameter.

    • ON_ERROR_RETRY_IGNORE: reruns the wrapper, as with the ON_ERROR_RETRY action, but continues with the wrapper execution if the error still happens after the retries.


5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level to be used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace file and 5 means that all message types will be written to it. The log types are the following:

o TRACE

o DEBUG

o INFO

o WARN

o ERROR

o FATAL


5.3.4 Add Record To List

• Object: Add_Object_To_List

• Description: adds a record to a list.

• Functions:

  o Constructor()

  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.


5.3.5 Condition

• Object: Condition

• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.

• Functions:

  o Constructor(expr)
    • expr: defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, …, ELEMENTN]", determines the elements on which the condition is evaluated.


5.3.6 Create List

• Object: Create_List

• Description: creates an empty list.

• Functions:

  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.

  o exec(): runs the component.


5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser

• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.

• Functions:

  o Constructor(): creates a persistent browser and returns its handler.

  o exec(): executes the component.


5.3.8 Diff

• Object: Diff

• Description: the Diff component compares two pages and returns the differences between their retrieved HTML code.

• Functions:

  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.

  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.

  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.

  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag of the added data.
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).

  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag of the added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).

  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag of the deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag of the deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).

  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return "null" instead of the page itself.
    • nullWhenEquals: "true" implies that "null" will be returned when both pages are equal; "false" means that the result page will be returned.

  o setIgnoreTagAttributes(simplifyTags): the component will not take the HTML tag attributes into account when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes will be ignored; with "false" they will not be ignored.

  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.

  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
    • mergedDeletions: with "true", the deleted content will be shown. If the value is "false", the configuration set with the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.

  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.

  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions are applied to HTML tokens of the page; those that match the regular expression are discarded before starting the comparison.
    • regexp: Perl [PERL] regular expression.


5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;

Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4. Using the ExecuteJS NSEQL command


5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.

• Functions:

  o Constructor(expression)
    • expression: object that defines the condition expression. This object is expressed as a character string (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.


5.3.11 Extractor

• Object: Extractor

• Description: this component is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).

• Functions:

  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.

  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.

  o setMergePatterns(merge): applies the technique of merging patterns for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern merge technique is to be applied, "false" if not. It is "true" by default.

  o setI18n(i18n): updates the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB, and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch
• Description: this obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): lets the user set the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): lets the user optionally set an explicit browsing sequence back to the page it came from, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

    • reusingConnection: if the value is set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing the information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot can go back when a Back() NSEQL command is executed, that is, when neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
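
The order of precedence described for syncWithPost, setBackSequence and setBackPages can be summarized as a small decision function. The sketch below is not ITPilot code: chooseBackStrategy, the shape of its options object, the returned labels and the default of one page are all hypothetical, introduced only to illustrate the fallback order stated above.

```javascript
// Hypothetical helper (not part of the ITPilot API): mirrors the order in which
// the text says ITPilot decides how to return to a previous page.
function chooseBackStrategy(options) {
  // syncWithPost is described as the default method, so a missing flag counts as true.
  var usePost = options.syncWithPost !== false;
  if (usePost) {
    return { method: "POST" };
  }
  // Otherwise an explicitly configured back sequence (an NSEQL program) is used.
  if (options.backSequence) {
    return { method: "BACK_SEQUENCE", sequence: options.backSequence };
  }
  // Last resort: the NSEQL Back() command; "pages" defaults to 1 here (assumption).
  return { method: "BACK_COMMAND", pages: options.backPages || 1 };
}

var byDefault = chooseBackStrategy({});
var withSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
console.log(byDefault.method, withSequence.method, withBack.method);
```

In a real wrapper the same choice would be expressed through the syncWithPost, setBackSequence and setBackPages calls on the component itself.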

5.3.13 Filter

• Object: Filter
• Description: this carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation, applied to the list of records described in the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to be filtered.
  o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subset complying with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to be filtered.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
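
The effect of ON_ERROR_IGNORE described in the note can be pictured with plain JavaScript. This is only a stand-in, not the ITPilot Filter object: filterRecords, the predicate argument (which plays the role of expr) and the string values given to the constants are assumptions made for illustration.

```javascript
// Constant names come from the guide; the string values are assumptions.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical stand-in for a filter: keeps the records accepted by the predicate.
function filterRecords(inputRecords, predicate, errorAction) {
  var result = [];
  for (var i = 0; i < inputRecords.length; i++) {
    try {
      if (predicate(inputRecords[i])) {
        result.push(inputRecords[i]);
      }
    } catch (e) {
      if (errorAction !== ON_ERROR_IGNORE) {
        throw e; // any other setting propagates the error
      }
      // ON_ERROR_IGNORE: only the record that caused the error is left out
    }
  }
  return result;
}

var records = [{ PRICE: "10" }, { PRICE: null }, { PRICE: "30" }];
// Evaluating the second record fails, because null has no length property.
var hasPrice = function (r) { return r.PRICE.length > 0; };
var kept = filterRecords(records, hasPrice, ON_ERROR_IGNORE);
console.log(kept.length); // 2
```

With ON_ERROR_RAISE the same call would stop at the second record instead of returning a partial list.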

5.3.14 Form Iterator

• Object: Form_Iterator
• Description: this allows a run loop to be generated for a specific form, where predetermined values for each of the included fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: if "true", the component will execute its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. There are JavaScript constants defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field for the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. There are JavaScript constants defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field for the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied by the field in the event of more than one field element with the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held for each value element in the event of replicated values.
    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely be contained in them.

  o click(field, value, state): function that allows an element to be selected and a "click" event to be run on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on Radio Buttons, this parameter indicates the elements selected, as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): function that indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML input field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: number that determines the maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if "true", the iterations will be executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.

  o hasNext(): function that determines whether there are more results. It returns "true" if there is at least one more result, or "false" if there is not.
  o close(): function that closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components will be reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
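
To see how the clicked-state constants translate into iterations, the following sketch expands a clickedArray into the marked/unmarked combinations it stands for. It is plain JavaScript, not ITPilot code: expandClickedArray and the string values of the constants are hypothetical, and combining several CLICKED_AND_NON_CLICKED_ELEMENT entries as a cross product is an assumption that the guide does not state explicitly.

```javascript
// Constant names come from the guide; the string values are assumptions.
var CLICKED_ELEMENT = "CLICKED";
var NON_CLICKED_ELEMENT = "NON_CLICKED";
var CLICKED_AND_NON_CLICKED_ELEMENT = "BOTH";

// Hypothetical helper: returns every combination of marked (true) and
// unmarked (false) states described by a clickedArray.
function expandClickedArray(clickedArray) {
  var combinations = [[]];
  clickedArray.forEach(function (state) {
    var choices;
    if (state === CLICKED_ELEMENT) {
      choices = [true];
    } else if (state === NON_CLICKED_ELEMENT) {
      choices = [false];
    } else {
      choices = [true, false]; // one combination marked, another unmarked
    }
    var next = [];
    combinations.forEach(function (prefix) {
      choices.forEach(function (choice) {
        next.push(prefix.concat([choice]));
      });
    });
    combinations = next;
  });
  return combinations;
}

var combos = expandClickedArray([CLICKED_ELEMENT, CLICKED_AND_NON_CLICKED_ELEMENT]);
console.log(combos.length); // 2
```

Each resulting combination would correspond to one form submission, retrieved in turn with hasNext() and next().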

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, for later browsing by means of a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information storage "cookies".
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: responsible for storing the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created as the set of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.

    • obl: parameter that indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional, but becomes compulsory once it has been specified; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: parameter that indicates whether the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): function that updates the component name.
    • name: new component name.
  o setI18n(i18n): function that updates the internationalization (i18n) of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component; it returns a record representing the wrapper initialization parameters.
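
The three values of the obl parameter can be read as a simple validation rule over the query parameters. The sketch below is a plain JavaScript stand-in and not the Init object: buildInputRecord, the field descriptors, the field names and the string values of the constants are hypothetical, and treating a FIXED field as ignoring any value supplied by the caller is an assumption.

```javascript
// Constant names come from the guide; the string values are assumptions.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical helper. fields: [{ name, obl, fixedValue }];
// query: the values supplied by the calling application.
function buildInputRecord(fields, query) {
  var record = {};
  fields.forEach(function (f) {
    var obl = f.obl || OPTIONAL; // OPTIONAL is the documented default
    if (obl === FIXED) {
      record[f.name] = f.fixedValue; // constant value (assumed to override the caller)
    } else if (Object.prototype.hasOwnProperty.call(query, f.name)) {
      record[f.name] = query[f.name];
    } else if (obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + f.name);
    } else {
      record[f.name] = null; // optional and not supplied
    }
  });
  return record;
}

var fields = [
  { name: "KEYWORD", obl: OBLIGATORY },
  { name: "LANG", obl: FIXED, fixedValue: "en" },
  { name: "MAXPRICE" }
];
var inputRecord = buildInputRecord(fields, { KEYWORD: "laptop" });
console.log(inputRecord.KEYWORD, inputRecord.LANG, inputRecord.MAXPRICE);
```

A query that omits KEYWORD would be rejected, while MAXPRICE may be left out freely.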

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option available in the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29.

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
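
For comparison with the parallel form in Figure 5, the sequential use of hasNext() and next() can be sketched as follows. The Iterator defined here is only a minimal stand-in written so that the example runs on its own; in a real wrapper the object is supplied by ITPilot, and the TITLE field is a hypothetical record attribute.

```javascript
// Minimal stand-in exposing the two documented methods (not the ITPilot implementation).
function Iterator(list) {
  this.list = list;
  this.position = 0;
}
Iterator.prototype.hasNext = function () {
  return this.position < this.list.length;
};
Iterator.prototype.next = function () {
  return this.list[this.position++];
};

var iterator = new Iterator([{ TITLE: "a" }, { TITLE: "b" }, { TITLE: "c" }]);
var titles = [];
while (iterator.hasNext()) {
  var recordInstance = iterator.next(); // records are returned in list order
  titles.push(recordInstance.TITLE);
}
console.log(titles.join(",")); // a,b,c
```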

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: these functions allow a query to be sent to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: this allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6 Using the Loop function
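
A runnable approximation of the pattern in Figure 6 is shown below. The Condition defined here is a stand-in: the real object takes a condition expression, whereas this one wraps a JavaScript function so that the sketch can run outside ITPilot. The variable names are hypothetical.

```javascript
// Stand-in for the Condition object (assumption: wraps a function, not an expression string).
function Condition(test) {
  this.test = test;
}
Condition.prototype.exec = function (inputs) {
  return this.test(inputs);
};

var page = 0;
var visited = [];
var loop = new Condition(function () { return page < 3; });
while (loop.exec([])) {
  visited.push(page); // stands for <loop operations>
  page++;
}

// The condition is evaluated before the first pass, so the body may never run.
var neverRuns = 0;
var closed = new Condition(function () { return false; });
while (closed.exec([])) {
  neverRuns++;
}
console.log(visited.join(","), neverRuns); // 0,1,2 0
```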

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: this allows iteration over different inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components will be reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation

5.3.21 Output

• Object: Output
• Description: this places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds a component input record to be used as a wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: this allows a record to be constructed from other records generated in the flow, and new attributes to be derived from already existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
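
The way a field expression refers to the records in recordsObj can be illustrated with a tiny resolver. This is a sketch only, not ITPilot's expression evaluator: resolveReference is a hypothetical helper, and it assumes that a reference is written as a dollar sign and a zero-based record index, followed by a dot and the field name (for instance $0.PARAM1). The functions of Appendix A of [GENER] are not covered.

```javascript
// Hypothetical resolver for references of the assumed form "$<index>.<FIELD>".
function resolveReference(expression, recordsObj) {
  var match = /^\$(\d+)\.(\w+)$/.exec(expression);
  if (!match) {
    throw new Error("Unsupported expression: " + expression);
  }
  var record = recordsObj[Number(match[1])]; // $0 is the first input record
  if (record === undefined) {
    throw new Error("No input record at position " + match[1]);
  }
  return record[match[2]];
}

var recordsObj = [{ PARAM1: "laptop" }, { PRICE: 499 }];
var first = resolveReference("$0.PARAM1", recordsObj);
var second = resolveReference("$1.PRICE", recordsObj);
console.log(first, second); // laptop 499
```

An expression that cannot be resolved raises an error here, which corresponds to the situation the errorAction parameter is meant to handle.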

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: this creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: This allows for loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions, defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
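The control flow of Figure 7 can be tried outside ITPilot with a stand-in for the Condition component. In the sketch below, the Condition function is a hypothetical stub (the real component is supplied by the ITPilot runtime and evaluates a condition expression), the two constants are placeholders, and pagesVisited is an invented counter. Only the shape of the do… while loop is taken from the figure.

```javascript
// Hypothetical stand-in for ITPilot's Condition component, so that the
// snippet runs in a plain JavaScript engine. It is NOT the real component.
function Condition(predicate) {
    this.predicate = predicate;
}
Condition.prototype.onError = function (errorType, action) {
    // The stub only records the error handler configuration.
    this.errorType = errorType;
    this.action = action;
};
Condition.prototype.exec = function (inputValues) {
    return this.predicate(inputValues);
};

// Placeholder constants; the real values are defined by ITPilot.
var RUNTIME_ERROR = "RUNTIME_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

var pagesVisited = 0; // invented counter, for illustration only

// Same shape as Figure 7: the body always runs once, and runs again
// for as long as exec() returns true.
var repeat = new Condition(function () { return pagesVisited < 3; });
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    pagesVisited++; // <loop_operations> would go here
} while (repeat.exec([]));

console.log(pagesVisited);
```

Because the condition is evaluated after the body, the loop operations are executed at least once even when the condition does not hold on entry.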

5.3.26 Script

• Description: This component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it: when the component is used from the graphical generation tool, it becomes a JavaScript function that is invoked from the place it occupies within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: This creates a browsing sequence in NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)

    • sequence: NSEQL browsing program (see [NSEQL]).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL

    • reusableConnection: this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

    • inputPage: optional parameter; this indicates the starting page. If it is not given, the NSEQL program is run directly.

  o exec(inputValues, inputPage): this runs the Sequence component, returning the last page that the browsing sequence has reached.

    • inputValues: list of values that can be used as input parameters within the browsing sequence.

    • inputPage: optional parameter; this describes the page from which the component browsing sequence is run.

  o setRetries(count): update function for the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): this allows for the waiting time between retries to be indicated.

    • mseconds: this indicates the waiting time between retries, in milliseconds.

  o close(): this closes the connection with the running browser.

  o syncWithPost(flag): this method indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL, containing the POST parameters with which the page was reached. This is the default synchronization function.

    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.

  o setBackSequence(back): this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that more data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.

    • reusingConnection: if "true", the connection from previous components will be reused. With the parameter set to "false", a new browser is opened and the data from the previous session is imported into it.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as back sequence.

    • pages: number of back pages.

  o toString(): this returns the NSEQL (see [NSEQL]) sequence.

  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
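As a rough illustration of the order in which the Sequence functions listed in this section are typically combined, the following sketch configures retries, runs the sequence and then closes it. The Sequence function defined here is a hypothetical stub that only mirrors the documented signatures so the snippet can run in a plain JavaScript engine. The NSEQL program is a placeholder string, the value of SEQUENCE_IEBROWSER is invented, and releasing the connection in a finally block is a choice of this sketch rather than something the guide prescribes.

```javascript
// Hypothetical stub mirroring some of the Sequence functions documented
// in this section. The real component is provided by the ITPilot runtime.
function Sequence(sequence, sequenceType, reusableConnection, inputPage) {
    this.sequence = sequence;
    this.sequenceType = sequenceType;
    this.reusableConnection = reusableConnection;
    this.inputPage = inputPage;
    this.retries = 0;
    this.retryDelay = 0;
    this.closed = false;
}
Sequence.prototype.setRetries = function (count) { this.retries = count; };
Sequence.prototype.setRetryDelay = function (mseconds) { this.retryDelay = mseconds; };
Sequence.prototype.exec = function (inputValues, inputPage) {
    // The real exec() returns the last page reached; the stub returns a
    // plain object describing what would have been run.
    return { program: this.sequence, inputValues: inputValues };
};
Sequence.prototype.close = function () { this.closed = true; };
Sequence.prototype.toString = function () { return this.sequence; };

// Placeholder value; the real constant is defined by ITPilot.
var SEQUENCE_IEBROWSER = "SEQUENCE_IEBROWSER";

// Placeholder text; see [NSEQL] for the actual syntax of a program.
var nseql = "<NSEQL browsing program>";

var seq = new Sequence(nseql, SEQUENCE_IEBROWSER, true);
seq.setRetries(3);        // retry up to three times on failure
seq.setRetryDelay(2000);  // wait two seconds between retries

var page = null;
try {
    page = seq.exec(["laptop"]); // "laptop" is an invented input value
} finally {
    seq.close();                 // release the browser connection
}

console.log(seq.toString());
```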

5.3.28 Store File

• Object: StoreFile

• Description: this stores the contents entered as the input parameter in a file.

• Functions:

  o Constructor(content, file)

    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input. In that case, the page content will be stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): this function determines if the output file name should be automatically generated when the input file is null or is a directory.

    • generate: indicates if the file name should be automatically generated.

  o setRetries(count): update function for the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): this allows for the waiting time between retries to be indicated.

    • mseconds: this indicates the waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread

• Description: this represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): this causes the thread to enter standby until all executions invoked with the execute function have finished.

  o execute(functionName, <list of arguments>): this launches a thread that runs the described function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests will be queued until the ongoing executions finish.

    • int: maximum number.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically by using the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where "mycustom" is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This is the function that defines the component output type. The possible values are:

    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function is responsible for defining the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
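The naming convention described in this section can be sketched with a small component file. The component name "trimtext", its behaviour, and the assumption that the input arrives as a single string value are all invented for this example. The type constants are declared locally, using the values listed in this section, only so that the file runs on its own. The body of trimtext_getInputStructure is a placeholder, because the format of the input schema is not shown here.

```javascript
// Output type codes, as listed in section 5.4.1 (only those used here).
var RECORD_TYPE = 3;
var LIST_TYPE = 1;
var STRING_TYPE = 13;

// Main function of a hypothetical custom component named "trimtext".
// Assumption: the input is a single string value.
function trimtext_main(trimtext_input) {
    var trimtext_output = null;
    if (trimtext_input !== null && trimtext_input !== undefined) {
        // Remove leading and trailing white space.
        trimtext_output = String(trimtext_input).replace(/^\s+|\s+$/g, "");
    }
    return trimtext_output;
}

// Placeholder: the real function must describe the input schema.
function trimtext_getInputStructure() {
    return null;
}

// The component returns a plain string.
function trimtext_getOutputType() {
    return STRING_TYPE;
}

// trimtext_getOutputStructure() is omitted on purpose: it is only needed
// when the output type is RECORD_TYPE or LIST_TYPE.

console.log(trimtext_main("  ITPilot  "));
```

Following the rule in this section, such a file would be saved with the .js suffix in the itp-custom-components directory.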

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, then it should be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file, the name of which matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters the custom component receives as input.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the VQL statement just has to be written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] "jscode"

where jscode is the recently generated JavaScript code.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the '\' character.
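The escaping rule in the note above can be automated with a small helper. This is a minimal sketch under two assumptions that should be checked against [VQL]: that the delimiting quotes are double quotes, and that the escape character is the backslash. The function names escapeQuotes and buildCreateWrapper, and the wrapper name used in the call, are invented for this example.

```javascript
// Escapes every double quote in the JavaScript code of a wrapper.
// Assumption: the escape character is the backslash.
function escapeQuotes(jsCode) {
    return jsCode.replace(/"/g, '\\"');
}

// Builds the CREATE WRAPPER statement. The MAINTENANCE FALSE clause is
// optional and is only added when maintenance is explicitly false.
function buildCreateWrapper(name, jsCode, maintenance) {
    var clause = (maintenance === false) ? " MAINTENANCE FALSE" : "";
    return "CREATE WRAPPER ITP " + name + clause +
           ' "' + escapeQuotes(jsCode) + '"';
}

// "my_wrapper" and the one-line script are illustrative values.
var vql = buildCreateWrapper("my_wrapper", 'var greeting = "hello";', false);
console.log(vql);
```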

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

FIGURES

Figure 1 Example of query execution to a wrapper
Figure 2 ITPilot Custom Function Sample
Figure 3 ITPilot Wrapper Skeleton in JavaScript
Figure 4 Using the ExecuteJS NSEQL command
Figure 5 Using threads in the Iterator component
Figure 6 Using the Loop function
Figure 7 Using the Repeat function
Figure 8 Using custom components from JavaScript

PREFACE

SCOPE

Denodo ITPilot enables easy access to, and extraction of data from, semi-structured Web data sources. This document is an introduction to application development using wrappers created by Denodo ITPilot.

WHO SHOULD USE THIS DOCUMENT

This document is aimed at developers who want to learn how to develop applications that make the best use of the advanced automation and Web data extraction functionalities provided by Denodo ITPilot. The detailed information required to install and manage the system is provided in other manuals, to which reference will be made as the need arises.

SUMMARY OF CONTENTS

More specifically, this document:

• Presents the fundamental steps needed to develop an application that uses the wrappers generated by Denodo ITPilot.

• Describes the task of exporting and deploying a wrapper as a Web Service.

• Gives a detailed description of how to use the development API offered by Denodo ITPilot.

• Provides an example of how to develop an application that uses a wrapper installed in a Denodo ITPilot execution server.

• Details how to create custom ITPilot functions.

• Explains how to develop wrappers by using the ITPilot JavaScript components.

1 INTRODUCTION

Denodo ITPilot is a Denodo Technologies solution for extracting and structuring the data present in Web sources. This process is carried out by constructing an abstraction of the target Web source, called a "wrapper", that frees the client applications from the difficulties associated with accessing and extracting the required data. ITPilot provides a distributed and scalable environment for generating, executing and maintaining wrappers. See [USER] and [GENER] for more information on how to create, install and maintain wrappers using Denodo ITPilot.

This manual describes the Java development API that allows creating clients that use wrappers that have already been generated and installed. The basic guidelines for using the API are given, the main components are described and some examples of use are provided. See the Javadoc documentation [JDOC] for more details on classes, attributes and operations. In addition, this manual explains how to access wrappers through Web Services exported in the execution environment.

2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES

The wrappers saved in the execution server can be invoked in two different ways. Firstly, the native ITPilot Java API can be used to access the wrappers, obtain their data structure and run queries on them from a Java application. This API is described in section 3. Another option is to expose these wrappers through Web Services. This latter option is described in this section.

A Web Service containing the following operations can be generated for a particular wrapper:

• An operation containing all searchable and compulsory parameters.

• Optionally, another operation with all searchable and compulsory parameters, plus any searchable and optional parameters selected in the Web Service generation process (this process is defined in [USER]).

The ITPilot execution server generates a Web Service as a war file that can be deployed in any J2EE application server.

2.1 WEB SERVICE TYPES

ITPilot allows a wrapper to be published as a Web Service to enable its use by any external application. The ITPilot execution server generates a Web Service as a war file that can be deployed in any J2EE application server. The types of Web Services that ITPilot can publish are:

• SOAP [SOAP] Web Services.

• REST-style Web Services that use HTTP directly as the transport protocol and return data encoded in XML.

• HTML Web Services. These are similar to the REST-style Web Services, but the output consists of an HTML table containing the response data for the query executed. The table includes JavaScript code to sort the results by any field and/or paginate the returned results. It is also possible to adjust the size of the table and the cells, and to modify its graphic appearance using a CSS file.

The following sections describe the querying process for these Web Services.

2.2 INVOKING SOAP WEB SERVICES

The SOAP version of the published Web Services can be accessed by using any Web Service client or client generator that meets the SOAP 1.2 [SOAP] and WSDL 1.1 [WSDL] standards, such as the Apache Axis wsdl2java [AXIS] or .NET Framework wsdl [DOTNET] tools. The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL to the Web Service WSDL:

http://<domain>:<port>/<service_name>/services/<service_name>?wsdl

The ITPilot distribution contains, in the samples/itpilot/itp-clients directory, a sample client generated using Apache Axis. The README file residing in this path contains detailed information on how to generate, compile and run the files comprising the client application.

2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES

This section describes how to invoke the REST and HTML versions of the Web Services that have been published by DataPort, once they have been deployed in the Web Service container. Once the war file has been deployed in the J2EE application server, the relative paths /rest and /html of the webapp show an information screen of the respective Web Service version, which lists the available operations.

Example: if the Web Service container is running on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the access URLs for the information page in the REST (XML output) and HTML versions would be:

http://acme:9090/testWS/rest
http://acme:9090/testWS/html

For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the xsd file that describes the schema of the XML document returned by the call to each operation. To access the XML Schema of the data returned by invoking an operation of the REST version of the Web Service, the following URL format should be used:

http://host:port/serviceName/rest/opName.xsd

Example: again, if the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the following URL will obtain the XML Schema of the data returned by the operation getPRODUCTDATA:

http://acme:9090/testWS/rest/getPRODUCTDATA.xsd

The format used to invoke a specific operation in the REST version is the following:

http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen

where n is the number of parameters of the operation. The format for the HTML version is the same, but replacing 'rest' with 'html'.

Example: the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS. Let us also suppose that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML Web Service versions, respectively:

http://acme:9090/testWS/rest/getPRODUCTDATA
http://acme:9090/testWS/html/getPRODUCTDATA

If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results when this parameter has a value equal to 1 would be obtained by writing:

http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
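Client code usually assembles these invocation URLs from their parts rather than writing them by hand. The helper below is a sketch of that idea and is not part of ITPilot: the function name buildOperationUrl is invented, and the URL layout it produces (host and port, then the service name, the version segment "rest" or "html", the operation name and a standard query string) is an assumption based on the examples in this section.

```javascript
// Builds the URL used to invoke an operation of an exported Web Service.
// version is "rest" or "html"; params maps parameter names to values.
function buildOperationUrl(host, port, serviceName, version, opName, params) {
    var url = "http://" + host + ":" + port + "/" + serviceName +
              "/" + version + "/" + opName;
    var pairs = [];
    for (var name in params) {
        if (Object.prototype.hasOwnProperty.call(params, name)) {
            // Encode names and values so that special characters are safe.
            pairs.push(encodeURIComponent(name) + "=" +
                       encodeURIComponent(params[name]));
        }
    }
    return pairs.length > 0 ? url + "?" + pairs.join("&") : url;
}

// Operation without parameters, HTML version.
console.log(buildOperationUrl("acme", 9090, "testWS", "html",
                              "getPRODUCTDATA", {}));

// Operation with one input parameter, REST version.
console.log(buildOperationUrl("acme", 9090, "testWS", "rest",
                              "getPRODUCTDATABYPRODID", { prod_id: 1 }));
```

Encoding each name and value separately keeps characters such as spaces or ampersands inside a value from being read as parameter separators.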

2.3.1 HTML Output Configuration

The HTML version of the published Web Services may be invoked with certain additional parameters to configure the HTML table used to display the results of the queries. The additional parameters are as follows:

• shownumresults: If this parameter is indicated with the true value, the table will display information on the number of results obtained by the wrapper.

• intervalsize: If this parameter is indicated, the results obtained by the wrapper will be displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.

• maxresults: This indicates a maximum number of results to be displayed. If the wrapper run returns more results than those indicated, all excess results will be rejected.

• cellwidth: Maximum cell width, expressed in number of characters. The width of each cell in the table will be adapted to the text, except where the size indicated in this parameter is exceeded. In this case, carriage returns will be added to divide the text into lines.

• cellheight: Maximum number of lines in a cell, after having divided the text according to the cellwidth parameter value. If this is exceeded, all the cells of this column are given a scroll bar.

• width: This specifies the maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.

• height: This specifies the maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.

These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:

httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen

For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a pagination interval size of 10. Once again, it is assumed that the Web Service container runs on port 9090 of the acme machine:

httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10

2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES

When the Web Service operations have been exported, there are some parameters that can be used to configure the connection pool used by the Web Services to connect to the ITPilot server. The web.xml file, which can be found in the WEB-INF path of the exported Web Service (either inside the war file generated by ITPilot or in the directory where the Web Service has been deployed), has three parameters used to configure the connection pool:

1. poolEnabled: this parameter is used to enable or disable the connection pool. The possible values are "true" or "false".

    <env-entry>
        <env-entry-name>poolEnabled</env-entry-name>
        <env-entry-value>false</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

2. poolInitSize: defines the initial size of the connection pool.

    <env-entry>
        <env-entry-name>poolInitSize</env-entry-name>
        <env-entry-value>0</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this parameter value, new requests will be queued until a connection becomes free.

    <env-entry>
        <env-entry-name>poolMaxActive</env-entry-name>
        <env-entry-value>30</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

3 ITPILOT DEVELOPMENT API

Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Amongst other functions, this API facilitates connection to a Denodo ITPilot execution server, obtaining a reference to a wrapper installed in said server and querying it. It also allows a series of additional tasks, like obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.

The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Amongst other tasks, said instance makes it possible to obtain a list of the available wrappers in the server, as well as a reference to a specific wrapper, represented through an instance of the class HTMLWrapperProxy. Said instance may be used to carry out various tasks on the wrapper, the most important of which is query execution.

When a query is invoked on the wrapper, the results are returned to the application in an asynchronous manner (i.e. the first results of the query will be accessible to the application as they are obtained from the source, without having to wait for all the results to be received).

The following subsections deal in more detail with each of the stages mentioned: connection to the server, obtaining references to wrappers, executing actions on them and query processing. An exhaustive description of the API on a programming level can be found in the Javadoc documentation [JDOC].

3.1 CONNECTING TO THE SERVER

There are two ways in which a connection to the ITPilot execution server can be established, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.

If Denodo ITPilot has been installed separately, then the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port in which the server is executed.

If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, then DataPort will be used as the execution server for ITPilot. In this case, it is possible to specify any database created in the Virtual DataPort server in the connection to the server, and to use any user defined in it. The actions allowed for the user will be consistent with the permissions assigned to said user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).

In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In this constructor, in addition to the machine and port in which the server is executed, the name of the database of the Virtual DataPort server to which the connection is to be made should be specified, as well as the user ID with which access is to be made and the associated password.

It is important to highlight that, even if Virtual DataPort is installed, it is equally possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called 'itpilot' will be accessed. The predefined user 'admin' (with the initial password 'admin') will be used to gain access.

3.2 OBTAINING WRAPPERS

As mentioned in the preceding section, connection to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class incorporates methods for obtaining data on the execution server and accessing the wrappers present in it:

• Collection getHTMLWrapperNames(). Obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only those wrappers associated with said database will be obtained.

• HTMLWrapperProxy getHTMLWrapper(String wpName). Obtains a reference to the wrapper with the name specified as parameter.

• Collection getDatabaseNames(). This method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.

• void deleteWrapper(String wpName). Deletes from the server the wrapper whose name is specified as parameter.

• void loadWrapper(String vql). Takes as input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.

• String getVQL(). Returns the VQL description of all wrappers in the ITPilot execution server.

33 USING WRAPPERS

Once a reference to a wrapper has been obtained (instance of the class HTMLWrapperProxy) various operations can be carried out on it through the methods of said class To execute a query to a wrapper we will use the method

HTMLWrapperResultIterator query(Map params) The query to be executed is represented as a map of pairs name of attributevalue The attribute names must match the names of the input parameters specified during the creation of the wrapper The values must be specified as character strings even when the input parameters expected by the wrapper belong to other type For example if a wrapper is expecting a float-type parameter and we want to assign the value 325 when invoking it we must pass the ldquo325rdquo string In the case of float double and date data types it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or in case of date data types the date pattern if it was set It is important to take into account that for the query to execute correctly a value must be specified for all the mandatory attributes See [GENER] for more information on the process of generating wrappers in ITPilot Although most of the applications will not require this a wrapper schema can be obtained using the method

HTMLWrapperMetaRegisterRawVO getSchema() This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of said schema The schema was defined during the generation of the wrapper (see [GENER]) The results returned by a wrapper follow a hierarchical structure Each output tuple contains a value for every attribute contained in the wrapper response Each attribute may be either atomic or compound The value of atomic attributes can be of any of the basic data types available in ITPilot int long float double text date

boolean or blob. The value of a compound attribute is always an array of registers. Likewise, each register is composed of several fields, and these fields may in turn be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director’s cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a “hierarchical” register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the atomic fields that make up the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained from an atomic field:

• Its type, using the method java.lang.Class getType().
• Whether the value is obtained from the source or not (that is, whether it is a searchable field that cannot be found in the output schema), using the method boolean isSearchStatus().
• In that case, whether it is mandatory or not, using the method boolean isMandatoryStatus().
• If they were defined during the generation process, the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).

Finally, the methods

void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting, via the API, whether a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
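The rule stated earlier in this section, that query parameter values are always passed as strings formatted according to the wrapper's internationalization configuration, can be illustrated with a minimal, self-contained sketch. It uses only the standard Java library. The class QueryParamsSketch, the parameter name MAX_PRICE, the value 3.25 and the two locales are assumptions made for illustration and are not part of the ITPilot API; the resulting map is the kind of object that would be passed to query(Map params).

```java
import java.text.NumberFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: builds a query parameter map whose values are all strings.
class QueryParamsSketch {

    // Formats a number as text using the decimal separator of the given locale.
    static String formatNumber(double value, Locale locale) {
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        nf.setGroupingUsed(false);        // no thousands separators
        nf.setMaximumFractionDigits(10);  // avoid truncating to 3 decimals
        return nf.format(value);
    }

    // DIRECTOR comes from the guide's example; MAX_PRICE is an assumed
    // float-type input parameter used only for illustration.
    static Map<String, String> buildParams(Locale wrapperLocale) {
        Map<String, String> params = new HashMap<String, String>();
        params.put("DIRECTOR", "Woody Allen");
        params.put("MAX_PRICE", formatNumber(3.25, wrapperLocale));
        return params;
    }

    public static void main(String[] args) {
        // The same numeric value yields different strings per locale.
        System.out.println(buildParams(Locale.US).get("MAX_PRICE"));      // 3.25
        System.out.println(buildParams(Locale.GERMANY).get("MAX_PRICE")); // 3,25
    }
}
```

The locale passed to buildParams should match whatever internationalization configuration the wrapper's Init component declares; a mismatch would produce a string the wrapper cannot parse as the expected type.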

3.4 PROCESSING QUERY RESULTS

The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time through the network).

The method hasNext() checks whether there are still elements to return. Because access is asynchronous, this method must be called before accessing each element, to make sure that data elements are available.

The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method

com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname)

where fieldname is the name of the desired field. The method next() throws an exception of type NoSuchElementException if no data are available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:

• Boolean checkErrors(): Checks whether an error has occurred during query execution. Returns ‘true’ if an error has occurred and ‘false’ if not.

• String getErrorDescription(): If an error has occurred, returns a textual description of it; otherwise returns null. The custom error messages specified by the wrapper creator for the ‘raise error’ handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
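The consumption pattern described in this section (call hasNext() before every next(), then check for errors once iteration ends) can be sketched without an ITPilot server. ResultCursor and FakeCursor below are hypothetical stand-ins that only mirror the method names mentioned above; they are not part of the ITPilot API, and throwing IllegalStateException on error is a design choice of this sketch rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in mirroring the methods named in this section.
interface ResultCursor extends Iterator<Object> {
    boolean checkErrors();
    String getErrorDescription();
}

// In-memory fake so the sketch runs on its own; a null error means success.
class FakeCursor implements ResultCursor {
    private final Iterator<String> rows;
    private final String error;

    FakeCursor(List<String> rows, String error) {
        this.rows = rows.iterator();
        this.error = error;
    }

    public boolean hasNext() { return rows.hasNext(); }
    public Object next() { return rows.next(); }
    public void remove() { throw new UnsupportedOperationException(); }
    public boolean checkErrors() { return error != null; }
    public String getErrorDescription() { return error; }
}

class ResultLoopSketch {

    // Collects every available result, then reports any execution error.
    static List<Object> drain(ResultCursor results) {
        List<Object> out = new ArrayList<Object>();
        while (results.hasNext()) {      // always guard next() with hasNext()
            out.add(results.next());
        }
        if (results.checkErrors()) {     // errors are checked after iterating
            throw new IllegalStateException(results.getErrorDescription());
        }
        return out;
    }

    public static void main(String[] args) {
        ResultCursor ok = new FakeCursor(Arrays.asList("row1", "row2"), null);
        System.out.println(drain(ok));   // prints [row1, row2]
    }
}
```

Keeping the error check outside the loop means rows that arrived before a failure are still processed, and the failure is surfaced once, at the end.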

3.4.1 Canceling Queries

The following method of the class HTMLWrapperResultIterator cancels the execution of an ongoing query:

void cancel()

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine ‘acme’, on port 9999. Next, it obtains a reference to the wrapper called “Movies”, whose schema is the one used as an example in the preceding section:

    TITLE
    DIRECTOR
    EDITIONS
        FORMAT
        PRICE
        DESCRIPTION

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value “Woody Allen”, and the results are processed and written to the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then the compound field EDITIONS is obtained; it is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except PRICE, which is a double. Finally, any errors produced during execution are checked.

    package com.denodo.itpilot.client;

    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.Map;

    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

    public class ITPilotExample {

        public static void main(String[] args) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

                // Get wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params (values are always strings)
                Map<String, String> queryParams = new HashMap<String, String>();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results; hasNext() must guard every call to next()
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ". ");

                    // Get and print atomic fields TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);

                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);
                        RegisterVO editionVO = (RegisterVO) editions.next();

                        SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);

                        // PRICE is the only non-text field: it is a double
                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);

                        SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\t DESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check errors once all available results have been read
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server: " + e.getMessage());
            }
        }
    }

Figure 1: Example of query execution to a wrapper

4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Although custom functions can be created using naming conventions, without dependencies on Denodo libraries, we recommend developing them with Java annotations. The annotations and the compound types and values required to create custom functions are located in:

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading or removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, different types, or both). For each signature of the function, the class must define a Java method implementing the function for those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

    Java                   ITPilot
    java.lang.Integer      int
    java.lang.Long         long
    java.lang.Float        float
    java.lang.Double       double
    java.lang.Boolean      boolean
    java.lang.String       text
    java.util.Calendar     date
    byte[]                 binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be Java primitive types (int, long, double, etc.).

4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.

For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + “ItpFunction”

Thus, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or that return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:
    o name: name of the custom function.
    o type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to mark the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional attribute, syntax, that specifies the syntax of the function signature presented to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to mark the method as the one that computes the return type of a function signature before executing a query.

• com.denodo.common.custom.annotations.CustomParam: Parameter annotation carrying the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
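As a sketch of the naming-convention approach described in this section, the class below follows the <FunctionName> + “ItpFunction” pattern and exposes a single variable-arity execute method, so it would be interpreted as a function named Concat_Sample. It depends only on the standard Java library. Skipping null inputs and returning null for a null array are assumptions made for illustration, as is the ConcatSketchDemo driver class. The function class is left package-private here only so that the snippet compiles in any file; a deployable class would presumably be declared public in its own source file.

```java
// Recognized by name: "Concat_Sample" + "ItpFunction".
class Concat_SampleItpFunction {

    // Variable-arity signature: the last (here, only) parameter is an array,
    // so the function accepts any number of text arguments.
    public String execute(String... inputs) {
        if (inputs == null) {
            return null;                 // assumption: null in, null out
        }
        StringBuilder sb = new StringBuilder();
        for (String input : inputs) {
            if (input != null) {         // assumption: null arguments are skipped
                sb.append(input);
            }
        }
        return sb.toString();
    }
}

// Hypothetical driver, only to show the method being called directly.
class ConcatSketchDemo {
    public static void main(String[] args) {
        Concat_SampleItpFunction f = new Concat_SampleItpFunction();
        System.out.println(f.execute("IT", "Pilot"));   // prints ITPilot
    }
}
```

Because the parameter and return types are java.lang.String, both map to the ITPilot text type according to the equivalency table at the beginning of section 4.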

4.2 COMPOUND TYPES

Compound types and values in custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on input values, or functions returning compound types, can implement an additional method that computes the return type without executing the function. This is optional in most cases, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execution method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same or an equivalent type as its respective parameter in the execute method.

The return type of the additional method is determined by the return type of the execute method:

• If the execute method returns a basic Java type, the additional method has to return the same basic Java class. I.e. if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table ‘Equivalency between Java and ITPilot data types’ at the beginning of section 4 for the type that these return values will have in ITPilot.

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits a string around matches of a given regular expression and returns the array of the resulting substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        // Function signature: SPLIT_SAMPLE(regexp, value)
        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name="regexp") String regex,
                @CustomParam(name="value") String value) {

            if (value == null || regex == null) {
                return null;
            }

            String[] result = value.split(regex);
            List<CustomRecordValue> arrayValues =
                    new ArrayList<CustomRecordValue>(result.length);

            for (String string : result) {
                // One single-field record per substring; a fresh map is used
                // for each record so that records never share state
                LinkedHashMap<String, Object> results =
                        new LinkedHashMap<String, Object>(1);
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue =
                        CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        // Computes the return type (array of records with one text field)
        // without executing the function
        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2: ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262].

The following sections assume some basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components (section 5.3) and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
        // component implementation
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist [1].

These functions correspond to the definition of the process as components: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

[1] Since version 4.0 SP1 this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

This function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the component catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure

This function determines the wrapper’s output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the catalog in section 5.3.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter lists the predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). rs.setText("FIELD", OBLIGATORY) is not valid; the following must be used instead: rs.setText("FIELD", "", OBLIGATORY).

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections define them.

5.3.2.1 Record Structure

• Object: Record_Structure

• Description: Represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).

• Functions:

    o Constructor(name)
        ▪ name: name of the structure.

    o setText(field, regexp, type): creates a new character string field in the record.

        ▪ field: name of the new field.
        ▪ regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is “”.
        ▪ type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setLink(field, type): creates a new Link-type field in the record.
        ▪ field: name of the new field.
        ▪ type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setInt(field, type): creates a new Integer-type field in the record.
        ▪ field: name of the new field.
        ▪ type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setBoolean(field, type): creates a new boolean-type field in the record.
        ▪ field: name of the new field.
        ▪ type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setLong(field, type): creates a new Long-type field in the record.
        ▪ field: name of the new field.
        ▪ type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setFloat(field, type): creates a new Float-type field in the record.
        ▪ field: name of the new field.
        ▪ type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setDouble(field, type): creates a new Double-type field in the record.
        ▪ field: name of the new field.
        ▪ type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
        ▪ field: name of the new field.
        ▪ type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setDate(field, regexp, format, type): creates a new Date-type field in the record.
        ▪ field: name of the new field.
        ▪ regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is “”.
        ▪ format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
        ▪ type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setRegister(record, type): creates a new Record-type field in the record.
        ▪ record: record name.
        ▪ type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setArray(name, structure, type): creates a new Array-type field in the record.
        ▪ name: name of the array.
        ▪ structure: data structure that represents the record structure contained in the array.
        ▪ type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.

NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR component, explained in section 5.3.22, must be used, except for Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List

• Functions:

    o setListName(listName): sets the name of the list.
        ▪ listName: name of the list.

    o add(obj): adds an element to the list.
        ▪ obj: element to add.

    o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these “common” functions.

5.3.3.1 onError function

• onError(errorId, errorAction): Tells the component how to behave in the event of an error. The onError function can be invoked several times with different errorId values.

    o errorId: the type of error whose behavior is being configured. The possible values are:

        ▪ RUNTIME_ERROR: error while the component is being run.
        ▪ CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
        ▪ HTTP_ERROR: error produced by an HTTP error.
        ▪ TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the http browser.
        ▪ SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).

    o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:

        ▪ ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
        ▪ ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return “null” if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, it is evaluated as “false” if they are used in a condition expression.
        ▪ ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each parameter.
        ▪ ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error still occurs after the retries.

5.3.3.2 debugLevel function

• debugLevel(level): Sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:

    o TRACE
    o DEBUG
    o INFO
    o WARN
    o ERROR
    o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression, written as a character string. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition described in the constructor against the input parameter elements and returns "true" or "false" depending on whether the condition is met.
    • elements: the elements on which the condition is evaluated, in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]".
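The positional placeholders $0, $1, ... can be illustrated with a small stand-in written in plain JavaScript. This is a sketch under stated assumptions: makeCondition is a hypothetical helper, and it evaluates the expression as JavaScript, whereas the real Condition component uses the expression language described in Appendix A of [GENER].

```javascript
// Hypothetical stand-in for the Condition component. Each $n in the
// expression is rewritten to elements[n], so $0 is the first element
// passed to exec, $1 the second, and so on.
function makeCondition(expr) {
  const body = expr.replace(/\$(\d+)/g, (_, index) => `elements[${index}]`);
  // Assumption: the expression is valid JavaScript once rewritten.
  const test = new Function("elements", `return Boolean(${body});`);
  return { exec: (elements) => test(elements) };
}

const myCondition = makeCondition("($0 <= $1)");
console.log(myCondition.exec([3, 5])); // prints true
console.log(myCondition.exec([7, 5])); // prints false
```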

5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.

5.3.8 Diff

• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages with the differences between them marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the start tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the end tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the start tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the end tag for deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
    • nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): the component does not take the HTML tag attributes into account when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes are ignored. With "false" they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page.
    • mergedDeletions: with "true", the deleted content is shown. If the value is "false", the configuration set by the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page. Tokens that match the regular expression are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
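To make the role of the prefix, suffix and separator parameters concrete, the following sketch compares two token lists and wraps added and deleted tokens in labels. It is an illustration only: diffTokens and the labels object are hypothetical, and the comparison uses a plain longest-common-subsequence algorithm, which is an assumption and not necessarily what the Diff component does internally.

```javascript
// Illustrative token-level diff. The label names mirror the Diff constructor
// parameters; the LCS algorithm itself is an assumption of this sketch.
function diffTokens(baseTokens, finalTokens, labels) {
  const n = baseTokens.length;
  const m = finalTokens.length;
  // lcs[i][j] = length of the longest common subsequence of
  // baseTokens[i..] and finalTokens[j..].
  const lcs = Array.from({ length: n + 1 }, () => new Array(m + 1).fill(0));
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      lcs[i][j] = baseTokens[i] === finalTokens[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  const deleted = (t) => labels.deletionPrefix + t + labels.deletionSuffix;
  const added = (t) => labels.additionPrefix + t + labels.additionSuffix;
  const out = [];
  let i = 0;
  let j = 0;
  while (i < n && j < m) {
    if (baseTokens[i] === finalTokens[j]) {
      out.push(baseTokens[i]); // token present in both pages
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      out.push(deleted(baseTokens[i++])); // only in the source page
    } else {
      out.push(added(finalTokens[j++])); // only in the target page
    }
  }
  while (i < n) out.push(deleted(baseTokens[i++]));
  while (j < m) out.push(added(finalTokens[j++]));
  return out.join(labels.tokenSeparator);
}

const labels = {
  additionPrefix: "[+", additionSuffix: "+]",
  deletionPrefix: "[-", deletionSuffix: "-]",
  tokenSeparator: " ",
};
console.log(diffTokens(["<p>", "old", "</p>"], ["<p>", "new", "</p>"], labels));
// prints "<p> [-old-] [+new+] </p>"
```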

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression, written as a character string. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter, "true" if the pattern merging technique is to be applied or "false" if not. This is "true" by default.
  o setI18n(i18n): updates the internationalization of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit navigation sequence back to the source page, against which further extraction operations will be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
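The order in which syncWithPost, setBackSequence and setBackPages are consulted can be summarized as a small decision function. The sketch below is a hypothetical model of that precedence: chooseSyncMethod, its config object and the default of one page for backPages are assumptions made for illustration, not part of the ITPilot API.

```javascript
// Hypothetical model of how the page state is recovered:
// POST first (the default), then an explicit back sequence, then Back().
function chooseSyncMethod(config) {
  // Assumed defaults: POST synchronization enabled, no back sequence,
  // and one page of history for the Back() fallback.
  const { syncWithPost = true, backSequence = null, backPages = 1 } = config;
  if (syncWithPost) {
    return { method: "POST" };
  }
  if (backSequence) {
    return { method: "BACK_SEQUENCE", sequence: backSequence };
  }
  return { method: "BACK", pages: backPages };
}

console.log(chooseSyncMethod({}).method); // prints "POST"
console.log(chooseSyncMethod({ syncWithPost: false }).method); // prints "BACK"
```

Keeping the three options in a single function makes the fallback chain explicit: each later option is only reached when the earlier ones are disabled or undefined.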

5.3.13 Filter

• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for a list of records, which are described in the exec function.
    • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subgroup that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
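The behavior described in the note can be pictured as follows. This is a minimal sketch in plain JavaScript, assuming a hypothetical filterRecords helper and a predicate function in place of the component's filter expression; it only models the "drop the failing record, keep the rest" rule.

```javascript
// Hypothetical model of FILTER under ON_ERROR_IGNORE: a record whose
// evaluation fails is left out, and the remaining records are still filtered.
function filterRecords(inputRecords, predicate, ignoreErrors) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record)) kept.push(record);
    } catch (err) {
      if (!ignoreErrors) throw err; // behaves like ON_ERROR_RAISE
      // ignoreErrors: skip only the record that caused the error
    }
  }
  return kept;
}

const records = [{ price: "10" }, { price: null }, { price: "2" }, { price: "30" }];
// The predicate throws a TypeError for the record whose price is null.
const expensive = (r) => Number(r.price.trim()) > 5;
console.log(filterRecords(records, expensive, true).map((r) => r.price));
// prints [ '10', '30' ]
```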

5.3.14 Form Iterator

• Object: Form_Iterator
• Description: generates a run loop for a specific form, in which predetermined values for each of the included fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: with "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied by the field when more than one field has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held for each values element in the event of replicated values.
    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.

  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the elements selected, as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: number that determines the maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch the iterations in parallel.
    • flag: with "true", the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.
  o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither an explicit back sequence nor a POST navigation has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
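One way to picture the iteration is as a walk over every combination of the values configured for each field, consumed through hasNext() and next(). The sketch below is a hypothetical model, not the Form_Iterator implementation: createFormIterator and expandState are invented names, and the assumption that iterations form the Cartesian product of the field values is made here for illustration.

```javascript
// Hypothetical stand-ins for the documented constants.
const CLICKED_ELEMENT = "CLICKED_ELEMENT";
const NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
const CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// A checkbox state expands to one or two candidate values (marked or not).
function expandState(state) {
  return state === CLICKED_AND_NON_CLICKED_ELEMENT
    ? [true, false]
    : [state === CLICKED_ELEMENT];
}

// Assumption: one iteration per combination of field values, optionally
// capped in the spirit of setMaxIterations.
function createFormIterator(fields, maxIterations = Infinity) {
  const names = Object.keys(fields);
  const total = names.reduce((count, name) => count * fields[name].length, 1);
  const limit = Math.min(total, maxIterations);
  let index = 0;
  return {
    hasNext: () => index < limit,
    next() {
      if (index >= limit) throw new Error("no more iterations");
      let rest = index++;
      const combination = {};
      // Decode the index as a mixed-radix number; the last field varies fastest.
      for (let k = names.length - 1; k >= 0; k--) {
        const values = fields[names[k]];
        combination[names[k]] = values[rest % values.length];
        rest = Math.floor(rest / values.length);
      }
      return combination;
    },
  };
}

const iterator = createFormIterator({
  query: ["java", "sql"],
  exactMatch: expandState(CLICKED_AND_NON_CLICKED_ELEMENT),
});
while (iterator.hasNext()) {
  console.log(iterator.next());
}
```

With two text values and a checkbox in the CLICKED_AND_NON_CLICKED_ELEMENT state, the loop body runs four times under this model.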

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identification.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. It is possible to execute the function with no parameters, for later browsing by using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information storage "cookies".
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: is responsible for storing the structure of the input data, which is the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created as a group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries must supply a value for the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries must supply a value for the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries must supply a value for the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries must supply a value for the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries must supply a value for the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries must supply a value for the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries must supply a value for the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries must supply a value for the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field, as defined in [DATEFORMAT]. The format is optional, but once it is specified it becomes compulsory; otherwise the wrapper may not run.
    • obl: indicates whether queries must supply a value for the field to create. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component, returning a record that represents the wrapper initialization parameters.
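The three obl values can be read as a simple validation rule applied when the initialization record is built. The following sketch models that rule in plain JavaScript; createInit, its exec(queryParams) signature and the use of null for an absent optional value are assumptions for illustration (the real exec() takes no arguments and reads the query from the wrapper context).

```javascript
// Hypothetical stand-ins for the documented obl constants.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// Minimal model of an Init-like object with text fields only.
function createInit() {
  const fields = [];
  return {
    setText(field, obl = OPTIONAL, fixedValue) {
      fields.push({ field, obl, fixedValue });
    },
    // Assumption: query parameters are passed in explicitly.
    exec(queryParams = {}) {
      const record = {};
      for (const { field, obl, fixedValue } of fields) {
        if (obl === FIXED) {
          record[field] = fixedValue; // constant, whatever the query says
        } else if (field in queryParams) {
          record[field] = queryParams[field];
        } else if (obl === OBLIGATORY) {
          throw new Error(`missing obligatory parameter: ${field}`);
        } else {
          record[field] = null; // optional and not supplied
        }
      }
      return record;
    },
  };
}

const init = createInit();
init.setText("query", OBLIGATORY);
init.setText("language", FIXED, "en");
init.setText("category"); // OPTIONAL by default
console.log(init.exec({ query: "denodo" }));
// prints { query: 'denodo', language: 'en', category: null }
```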

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records over which to iterate.
  o hasNext(): determines whether there are more results over which to iterate. It returns “true” if there is at least one more result.
  o next(): returns the next element of the iteration. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
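The hasNext()/next() pattern described for the Iterator component can be sketched in a few lines. The sketch below is not ITPilot code: StandInIterator is a hypothetical stand-in written only so that the example runs outside ITPilot, collectTitles is an illustrative name, and the records are assumed to be plain objects with a TITLE field. Inside a wrapper, the Iterator object provided by ITPilot would take the place of the stand-in.

```javascript
// Hypothetical stand-in for the ITPilot Iterator object (assumption: it only
// mimics the Constructor(list), hasNext() and next() functions listed above).
function StandInIterator(list) {
    this.list = list;
    this.position = 0;
}

StandInIterator.prototype.hasNext = function () {
    // True while at least one more record is left.
    return this.position < this.list.length;
};

StandInIterator.prototype.next = function () {
    // Returns the current record and advances to the following one.
    return this.list[this.position++];
};

// Sequential iteration: visits the records in list order and collects the
// TITLE field of each one (the field name is an assumption for illustration).
function collectTitles(records) {
    var iterator = new StandInIterator(records);
    var titles = [];
    while (iterator.hasNext()) {
        var recordInstance = iterator.next();
        titles.push(recordInstance.TITLE);
    }
    return titles;
}
```

Checking hasNext() before every call to next() keeps the loop safe for empty lists, which simply produce no iterations.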

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6 Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: allows iterating over different inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out there.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0PARAM1” indicates that the field will contain the field PARAM1 of the first input record in the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, the Record Constructor returns the list of filtered elements except for the one that caused the error.

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created that access other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases this may not be a good option if the previous iterator runs in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
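The practical difference between the Loop component (section 5.3.19) and the Repeat component lies in when the condition is evaluated: a while loop tests it before the first pass, a do… while loop after it. The following minimal sketch in plain JavaScript illustrates this. It is not ITPilot code: the condition argument is a hypothetical function standing in for the exec([]) call of the Condition object, and runLoop and runRepeat are illustrative names.

```javascript
// Loop-style flow: the condition is checked first, so the body may run zero times.
function runLoop(condition, body) {
    var passes = 0;
    while (condition()) {
        body();
        passes++;
    }
    return passes;
}

// Repeat-style flow: the body runs once before the condition is checked,
// so it always runs at least one time.
function runRepeat(condition, body) {
    var passes = 0;
    do {
        body();
        passes++;
    } while (condition());
    return passes;
}

// A condition that is never met, used to compare both flows.
function neverMet() {
    return false;
}

var loopPasses = runLoop(neverMet, function () {});     // body is never run
var repeatPasses = runRepeat(neverMet, function () {}); // body is run exactly once
```

When the operations inside the loop must happen at least once (for example, processing the first page of a result list before checking whether there is a next one), the Repeat form is the natural choice.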

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When this component is used from the graphic generation interface, it becomes a JavaScript function that is invoked from the position the component holds within the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out there.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation

5.3.28 Store File

• Object: StoreFile
• Description: stores in a file the contents received as input parameter.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to wait until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches the execution of the given function in a thread.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }

  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
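As an illustration of the functions listed in this section, a component file might look like the following minimal sketch. It assumes a component named mycustom that receives a string and returns it trimmed and in upper case. The input being a plain string, the transformation itself and the local STRING_TYPE variable (declared here only so that the sketch runs on its own, with the value 13 taken from the list of output types) are assumptions made for illustration. mycustom_getInputStructure is omitted because the format of its body is not specified here, and mycustom_getOutputStructure is not needed because the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Value taken from the list of output types; declared locally only so that
// this sketch is self-contained (assumption).
var STRING_TYPE = 13;

// Main function of the hypothetical "mycustom" component.
// Assumption: the input arrives as a plain string value.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    // Strip leading and trailing whitespace, then convert to upper case.
    mycustom_output = String(mycustom_input).replace(/^\s+|\s+$/g, "").toUpperCase();
    return mycustom_output;
}

// Declares that the component returns a string value.
function mycustom_getOutputType() {
    return STRING_TYPE;
}
```

Keeping the component name as the prefix of every function (mycustom_main, mycustom_getOutputType) follows the naming pattern shown in the function list of this section.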

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the following VQL statement has to be written:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl


PREFACE

SCOPE

Denodo ITPilot enables easy access to, and extraction of, data from semi-structured Web data sources. This document is an introduction to application development using wrappers created by Denodo ITPilot.

WHO SHOULD USE THIS DOCUMENT

This document is aimed at developers who want to learn how to develop applications that make the best use of the advanced automation and Web data extraction functionalities provided by Denodo ITPilot. The detailed information required to install and manage the system is provided in other manuals, to which reference is made as the need arises.

SUMMARY OF CONTENTS

More specifically, this document:

• Presents the fundamental steps needed to develop an application that uses the wrappers generated by Denodo ITPilot.
• Describes the task of exporting and deploying a wrapper as a Web Service.
• Gives a detailed description of how to use the development API offered by Denodo ITPilot.
• Provides an example of how to develop an application that uses a wrapper installed in a Denodo ITPilot execution server.
• Details how to create custom ITPilot functions.
• Explains how to develop wrappers by using the ITPilot JavaScript components.

1 INTRODUCTION

Denodo ITPilot is a Denodo Technologies solution that makes it possible to extract and structure the data present in Web sources. This is done by constructing an abstraction of the target Web source, called a “wrapper”, that frees client applications from the difficulties associated with accessing and extracting the required data. ITPilot provides a distributed and scalable environment for generating, executing and maintaining wrappers. See [USER] and [GENER] for more information on how to create, install and maintain wrappers using Denodo ITPilot.

This manual describes the Java development API, which allows clients to be created that use wrappers that have already been generated and installed. It gives the basic guidelines for using the API, describes its main components and provides some examples of use. See the Javadoc documentation [JDOC] for more details on classes, attributes and operations. In addition, this manual explains how to access wrappers through Web Services exported in the execution environment.

2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES

The wrappers saved in the execution server can be invoked in two different ways. Firstly, the native ITPilot Java API can be used to access the wrappers, obtain their data structure and run queries on them from a Java application. This API is described in section 3. The other option is to expose the wrappers through Web Services, which is described in this section. A Web Service containing the following operations can be generated for a particular wrapper:

• An operation containing all searchable and compulsory parameters.
• Optionally, another operation with all searchable and compulsory parameters plus any searchable and optional parameters selected in the Web Service generation process (this process is described in [USER]).

The ITPilot execution server generates a Web Service as a .war file that can be deployed in any J2EE application server.

2.1 WEB SERVICE TYPES

ITPilot allows a wrapper to be published as a Web Service so that it can be used by any external application. The ITPilot execution server generates the Web Service as a .war file that can be deployed in any J2EE application server. The types of Web Services that ITPilot can publish are:

• SOAP [SOAP] Web Services.
• REST-style Web Services, which use HTTP directly as the transport protocol and return data encoded in XML.
• HTML Web Services. These are similar to the REST-style Web Services, but the output consists of an HTML table containing the response data for the query executed. The table includes JavaScript code to sort the results by any field and/or paginate the returned results. It is also possible to adjust the size of the table and the cells and to modify its graphic appearance using a CSS file.

The following sections describe the querying process for these Web Services.

2.2 INVOKING SOAP WEB SERVICES

The SOAP version of the published Web Services can be accessed by using any Web Service client or client generator that meets the SOAP 1.2 [SOAP] and WSDL 1.1 [WSDL] standards, such as the Apache Axis wsdl2java [AXIS] or .NET Framework wsdl [DOTNET] tools. The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL of the Web Service WSDL:

    http://<domain>:<port>/<service_name>/services/<service_name>?wsdl

The ITPilot distribution contains, in the samples/itpilot/itp-clients directory, a sample client generated using Apache Axis. The README file in this path contains detailed information on how to generate, compile and run the files comprising the client application.

2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES

This section describes how to invoke the REST and HTML versions of the Web Services that have been published by DataPort, once they have been deployed in the Web Service container. Once the .war file has been deployed in the J2EE application server, the relative paths /rest and /html of the web application show an information screen for the respective Web Service version, listing the available operations.

Example: if the Web Service container is running on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the access URLs for the information page in the REST (XML output) and HTML versions would be:

    http://acme:9090/testWS/rest
    http://acme:9090/testWS/html

For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the .xsd file that describes the schema of the XML document returned by a call to each operation. To access the XML Schema of the data returned by invoking an operation of the REST version of the Web Service, the following URL format should be used:

    http://host:port/serviceName/rest/opName.xsd

Example: again, if the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the following URL obtains the XML Schema of the data returned by the operation getPRODUCTDATA:

    http://acme:9090/testWS/rest/getPRODUCTDATA.xsd

The format used to invoke a specific operation in the REST version is the following:

    http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen

where n is the number of parameters of the operation. The format for the HTML version is the same, replacing ‘rest’ with ‘html’.

Example: the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS. Let us also suppose that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML Web Service versions, respectively:

    http://acme:9090/testWS/rest/getPRODUCTDATA
    http://acme:9090/testWS/html/getPRODUCTDATA

If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results for the value 1 of this parameter would be obtained by writing:

    http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
    http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
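The URL format described in this section can be assembled programmatically. The following is a minimal sketch in plain JavaScript; buildOperationUrl is a hypothetical helper name, not part of ITPilot, and it assumes that parameters are passed as an ordinary object and that names and values should be percent-encoded with the standard encodeURIComponent function.

```javascript
// Builds the invocation URL of an operation of a published Web Service.
// version is either "rest" or "html"; params maps parameter names to values.
function buildOperationUrl(host, port, serviceName, version, opName, params) {
    var url = "http://" + host + ":" + port + "/" + serviceName + "/" + version + "/" + opName;
    var pairs = [];
    for (var name in params) {
        if (Object.prototype.hasOwnProperty.call(params, name)) {
            // Encode both sides so that special characters do not break the query string.
            pairs.push(encodeURIComponent(name) + "=" + encodeURIComponent(params[name]));
        }
    }
    // Operations without parameters are invoked without a query string.
    return pairs.length > 0 ? url + "?" + pairs.join("&") : url;
}

// The examples of this section, rebuilt with the helper.
var restUrl = buildOperationUrl("acme", 9090, "testWS", "rest", "getPRODUCTDATABYPRODID", { prod_id: 1 });
var htmlUrl = buildOperationUrl("acme", 9090, "testWS", "html", "getPRODUCTDATA", {});
```

Encoding the parameter values matters as soon as they contain spaces, ampersands or non-ASCII characters, which would otherwise be misread as part of the URL structure.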

2.3.1 HTML Output Configuration

The HTML version of the published Web Services may be invoked with certain additional parameters that configure the HTML table used to display the results of the queries. The additional parameters are as follows:

• shownumresults: if this parameter is given the value true, the table displays information on the number of results obtained by the wrapper.

• intervalsize: If this parameter is indicated, the results obtained by the wrapper will be displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.

• maxresults: This indicates the maximum number of results to be displayed. If the wrapper run returns more results than indicated, all excess results will be discarded.

• cellwidth: Maximum cell width, expressed in number of characters. The width of each cell in the table adapts to the text, except where the size indicated in this parameter is exceeded. In this case, carriage returns are added to divide the text into lines.

• cellheight: Maximum number of lines in a cell after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of this column are given a scroll bar.

• width: This specifies the maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.

• height: This specifies the maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.

These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:

httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen

For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a pagination interval size of 10. Once again, it is assumed that the Web service container runs on port 9090 of the acme machine:

httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10

2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES

Once the Web Service operations have been exported, some parameters can be used to configure the connection pool that the Web Services use to connect to the ITPilot server. The web.xml file found in the WEB-INF path of the exported web service (either inside the war file generated by ITPilot or in the directory where the Web Service has been deployed) has three parameters for configuring the connection pool:

1. poolEnabled: this parameter is used to enable or disable the connection pool. The possible values are "true" or "false".

<env-entry>
    <env-entry-name>poolEnabled</env-entry-name>
    <env-entry-value>false</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

2. poolInitSize: defines the initial size of the connection pool.

<env-entry>
    <env-entry-name>poolInitSize</env-entry-name>
    <env-entry-value>0</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections reaches this value, new requests are queued until a connection becomes free.

<env-entry>
    <env-entry-name>poolMaxActive</env-entry-name>
    <env-entry-value>30</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>
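
To check which pool settings a deployed service is using, the env-entry elements can be read with the standard Java XML parser. This is a minimal sketch under stated assumptions: PoolConfigSketch and readEnvEntries are hypothetical names, and the descriptor content is passed in as a string without a DOCTYPE declaration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of ITPilot): lists the env-entry
// name/value pairs declared in a web.xml descriptor.
public class PoolConfigSketch {

    static Map<String, String> readEnvEntries(String webXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            webXml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> entries = new LinkedHashMap<String, String>();
            NodeList nodes = doc.getElementsByTagName("env-entry");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element entry = (Element) nodes.item(i);
                entries.put(childText(entry, "env-entry-name"),
                            childText(entry, "env-entry-value"));
            }
            return entries;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse web.xml content", e);
        }
    }

    // Returns the trimmed text of the first child element with the given
    // tag name, or null when the element is absent.
    private static String childText(Element parent, String tag) {
        NodeList children = parent.getElementsByTagName(tag);
        return children.getLength() == 0
                ? null
                : children.item(0).getTextContent().trim();
    }
}
```

All three pool parameters are declared with type java.lang.String, so the values come back as strings ("false", "0", "30") and have to be converted by the caller if numeric or boolean values are needed.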

3 ITPILOT DEVELOPMENT API

Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Among other functions, this API facilitates connecting to a Denodo ITPilot execution server, obtaining a reference to a wrapper installed in that server, and querying it. It also supports additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.

The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance makes it possible to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance may be used to carry out various tasks on the wrapper, the most important of which is query execution.

When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e., the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).

The following subsections deal in more detail with each of the stages mentioned: connection to the server, obtaining references to wrappers, executing actions on them, and query processing. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].

3.1 CONNECTING TO THE SERVER

There are two ways to establish a connection to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.

If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server is running.

If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, the connection can specify any database created in the Virtual DataPort server and use any user defined in it. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).

In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server is running, this constructor takes the name of the Virtual DataPort database to connect to, the user ID to connect with, and the associated password.

Note that even if Virtual DataPort is installed, it is still possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called 'itpilot' is accessed, using the predefined user 'admin' (with the initial password 'admin').

3.2 OBTAINING WRAPPERS

As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class provides methods for obtaining data on the execution server and accessing the wrappers present in it:

• Collection getHTMLWrapperNames(): Obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only the wrappers associated with that database will be obtained.

• HTMLWrapperProxy getHTMLWrapper(String wpName): Obtains a reference to the wrapper whose name is specified as parameter.

• Collection getDatabaseNames(): This method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.

• void deleteWrapper(String wpName): Deletes from the server the wrapper whose name is specified as parameter.

• void loadWrapper(String vql): Takes as input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.

• String getVQL(): Returns the VQL description of all wrappers in the ITPilot execution server.

3.3 USING WRAPPERS

Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:

HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified during the creation of the wrapper. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper expects a float-type parameter and we want to assign the value 325 when invoking it, we must pass the string "325". For float, double and date data types, make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or, in the case of date data types, according to the date pattern if one was set.

For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.

Although most applications will not require this, a wrapper schema can be obtained using the method:

HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).

The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and these fields may again be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the atomic fields that comprise the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field:

• Its type (method java.lang.Class getType()).
• Whether the value is obtained from the source or not, that is, whether it is a searchable field that does not appear in the output schema (method boolean isSearchStatus()) and, in that case, whether it is mandatory (method boolean isMandatoryStatus()).
• If they were defined during the generation process, the regular expression (method java.lang.String getRegexp()) and the aliases defined for the field (method java.util.List getTextValues()).

Finally, the methods

void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting, via the API, whether or not a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If autodeploy is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name).

If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
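
Because query values are always passed as strings, numeric and date values have to be formatted before being placed in the parameter map. The following is a minimal sketch of that step using only standard Java classes. It assumes that the wrapper's internationalization configuration corresponds to a Java Locale and that its date pattern follows the SimpleDateFormat syntax; both are assumptions, and QueryParamSketch and its methods are hypothetical names.

```java
import java.text.NumberFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper (not part of ITPilot): renders values as the
// strings expected in the query parameter map.
public class QueryParamSketch {

    // Formats a float/double value with the decimal separator of the
    // given locale and without grouping separators.
    static String formatNumber(double value, Locale locale) {
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        format.setGroupingUsed(false);
        format.setMaximumFractionDigits(10);
        return format.format(value);
    }

    // Formats a date using the given pattern (assumed to match the
    // date pattern configured in the wrapper).
    static String formatDate(Date date, String pattern, Locale locale) {
        return new SimpleDateFormat(pattern, locale).format(date);
    }
}
```

With this sketch, the same value 3.25 is rendered as "3.25" for Locale.US and as "3,25" for Locale.GERMANY, which is the kind of difference the internationalization configuration introduces.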

3.4 PROCESSING QUERY RESULTS

The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time, through the network).

The method hasNext() checks whether there are still elements to return. Because results arrive asynchronously, this method must be called before accessing each element to make sure that data is available. The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method

com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname)

where fieldname is the name of the desired field. The method next() throws a NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g., an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:

• Boolean checkErrors(): Checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.

• String getErrorDescription(): If an error has occurred, returns a textual description of it. Otherwise, it returns null. The custom error messages specified by the wrapper creator for the 'raise error' handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.

3.4.1 Canceling Queries

The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

void cancel()

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the 'acme' machine, on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in the section on using wrappers:

TITLE
DIRECTOR
EDITIONS
    FORMAT
    PRICE
    DESCRIPTION

where TITLE and DIRECTOR are optional search fields. A query is then issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and printed to the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except the field PRICE, which is a double. Finally, any errors produced during execution are checked.

package com.denodo.itpilot.client;

import java.util.List;
import java.util.HashMap;
import java.util.Map;
import java.util.Iterator;

import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
// DoubleVO is used below; it is assumed to live in the same package as SimpleVO
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

public class ITPilotExample {

    public static void main(String args[]) {
        try {
            // Connect to server
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

            // Get wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare query params
            Map queryParams = new HashMap();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate results
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + ". ");

                // Get and print atomic fields TITLE, DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);

                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();

                    SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);

                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);

                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\tDESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server");
        } finally {
        }
    }
}

Figure 1: Example of query execution to a wrapper

4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file and added to ITPilot, that can then be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created using naming conventions, without dependencies on Denodo libraries, but developing them with Java annotations is recommended. The annotations and the compound types and values required to create custom functions are located in:

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading or removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types, or both). For each signature, the class must define a Java method implementing the function for those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

Java                   ITPilot
java.lang.Integer      int
java.lang.Long         long
java.lang.Float        float
java.lang.Double       double
java.lang.Boolean      boolean
java.lang.String       text
java.util.Calendar     date
byte[]                 binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be Java primitive types (int, long, double, etc.).

4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample. All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array, e.g., the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or that return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:

    • name: name of the custom function.

    • type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to mark the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional attribute, syntax, which specifies the syntax of the function signature shown to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to mark the method as the one that computes the return type of a function signature before executing a query.

• com.denodo.common.custom.annotations.CustomParam: Parameter annotation carrying the parameter name, used to make the automatically generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
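
The naming-convention rules described earlier in this section can be sketched with the Concat_Sample example mentioned in the text. This is an illustrative sketch, not the code shipped with ITPilot: the class and method names follow the convention, but the body (concatenating the inputs and skipping null values) is an assumption.

```java
// Sketch of a custom function defined only through naming conventions:
// the class name ends in "ItpFunction" and the signature method is
// named execute. It has no dependencies on Denodo libraries.
public class Concat_SampleItpFunction {

    // Arity-n signature: the last (and only) parameter is an array.
    // Parameters use object types (String), never Java primitives.
    public String execute(String... inputs) {
        StringBuilder result = new StringBuilder();
        for (String input : inputs) {
            if (input != null) { // assumption: null arguments are skipped
                result.append(input);
            }
        }
        return result.toString();
    }
}
```

Because the varargs method covers any number of text arguments, a single execute method stands in for what would otherwise be several fixed-arity signatures.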

4.2 COMPOUND TYPES

Compound types and values in custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on input values, or functions returning compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.

The return type of the additional method depends on what the execute method returns:

• If the execute method returns a basic Java type, the additional method has to return the corresponding Java class; i.e., if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 for the type that these return values will have in ITPilot.
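
For the simplest case covered by these rules (an execute method returning a basic Java type), the pair of methods could look like the sketch below. It uses the naming conventions of section 4.1, so it needs no Denodo libraries; the function Upper_Sample and its body are hypothetical and serve only to illustrate the rules.

```java
import java.util.Locale;

// Hypothetical custom function defined through naming conventions,
// with the optional method that computes the return type.
public class Upper_SampleItpFunction {

    // Function signature: one text parameter, text result.
    public String execute(String value) {
        return value == null ? null : value.toUpperCase(Locale.ROOT);
    }

    // Same number and types of parameters as execute. Since execute
    // returns a String object, this method returns java.lang.String.class.
    public Class<?> executeReturnType(String value) {
        return String.class;
    }
}
```

In this case the additional method is optional, since the return type can be obtained from the execute method directly; it becomes mandatory only when execute returns java.lang.Object or a non-constant compound type.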

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    @CustomExecutor()
    public CustomArrayValue split_sample(
            @CustomParam(name="regexp") String regex,
            @CustomParam(name="value") String value) {

        if (value == null || regex == null) {
            return null;
        }

        String[] result = value.split(regex);
        LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
        List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);

        // Wrap each substring in a one-field record and add it to the array
        for (String string : result) {
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        // The result is an array of records with a single text field
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}

Figure 2: ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262].

The following sections assume some basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

function getInit() {
    var start = new Init();
    start.setText("INITPARAM", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
    // component implementation
}

Figure 3: ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main(): the only mandatory function. It contains the component implementation.

2. getInit(): must be used to return the set of searchable parameters.

3. getOutputSchema(): returns the structure of the output objects, if they exist (see note 1).

These functions mirror the components of the process definition: the input parameters are those defined in the Initialization component, and the output record is defined exactly as it is received by the Output component.

Note 1: Since version 4.0 SP1, the function previously known as getMetadata has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the component catalog).

5.2.2 Main Function

The main() function is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The published functions of every predefined ITPilot component are described in section 5.3.

5.2.3 Generating the Output Structure

The getOutputSchema() function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter lists the predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.

NOTE: Some of the parameters of the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if an argument to its right has to be specified. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

    rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL)

    rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used instead: rs.setText("FIELD", "", OBLIGATORY)
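The rule about omitting only trailing arguments follows from the way JavaScript binds arguments by position. The sketch below illustrates this outside ITPilot. RecordStructureSketch and the string values given to OPTIONAL and OBLIGATORY are hypothetical stand-ins, not the real Record_Structure object or constants, and the code sticks to ECMAScript 3 syntax.

```javascript
// Hypothetical stand-ins: inside ITPilot these names are supplied by the runtime.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

function RecordStructureSketch(name) {
    this.name = name;
    this.fields = [];
}

// setText(field, regexp, type): omitted trailing arguments take their defaults.
RecordStructureSketch.prototype.setText = function (field, regexp, type) {
    this.fields.push({
        field: field,
        regexp: regexp === undefined ? "" : regexp,
        type: type === undefined ? OPTIONAL : type
    });
};

var rs = new RecordStructureSketch("OUT_REC");
rs.setText("FIELD_A");                  // same as ("FIELD_A", "", OPTIONAL)
rs.setText("FIELD_B", "", OBLIGATORY);  // regexp is given so that type can be reached
rs.setText("FIELD_C", OBLIGATORY);      // mistake: OBLIGATORY is bound to regexp
```

In this sketch the third call silently stores OBLIGATORY as the regular expression and leaves the field optional. How ITPilot itself reacts to the invalid call is not described here; the point is only that the second position always belongs to regexp.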

5.3.2 Data Structures

ITPilot defines two data structures: List and Record (a data record defined by the Record Structure object). The following sections describe them.

5.3.2.1 Record Structure

• Object: Record_Structure

• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see section 5.2.3).

• Functions:

  o Constructor(name)

    • name: name of the structure.

  o setText(field, regexp, type): creates a new character string field in the record.

    • field: name of the new field.

    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setLink(field, type): creates a new Link-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setInt(field, type): creates a new Integer-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setBoolean(field, type): creates a new boolean-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setLong(field, type): creates a new Long-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setFloat(field, type): creates a new Float-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setDouble(field, type): creates a new Double-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.

    • field: name of the new field.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setDate(field, regexp, format, type): creates a new Date-type field in the record.

    • field: name of the new field.

    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".

    • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setRegister(record, type): creates a new Record-type field in the record.

    • record: record name.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setArray(name, structure, type): creates a new Array-type field in the record.

    • name: name of the array.

    • structure: data structure that represents the record structure contained in the array.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o toString(): transforms the record into a character string for its representation.

When a custom component is created from an ITPilot wrapper program (see section 5.4), a Record Structure is defined to represent the input values of the custom component.

NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR component must be used, as explained in section 5.3.22, except in the cases of Text-, Integer-, Float- and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List

• Functions:

  o setListName(listName): sets the name of the list.

    • listName: name of the list.

  o add(obj): adds an element to the list.

    • obj: element to add.

  o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. Where a component does not provide one of these "common" functions, the catalog says so.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave in the event of an error. The onError function can be invoked several times with different errorId parameter values.

  o errorId: indicates the type of error whose handling is being configured. The possible values are:

    • RUNTIME_ERROR: error while the component is being run.

    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.

    • HTTP_ERROR: error produced by an HTTP error.

    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.

    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).

  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:

    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.

    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have any kind of return value will return "null" if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it will be evaluated as "false" if they are used in a condition expression.

    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each parameter.

    • ON_ERROR_RETRY_IGNORE: reruns the wrapper, as with ON_ERROR_RETRY, but continues with the wrapper execution if the error persists after the retries.
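The four error actions can be modelled outside ITPilot as shown below. This is only a sketch of the behaviour described above, not ITPilot's implementation. The runWithPolicy function, its retries argument and the string values given to the constants are hypothetical. It retries a single step rather than the whole wrapper, and the delay between retries is left out.

```javascript
// Hypothetical stand-ins for the ITPilot constants, so the sketch runs on its own.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Runs 'step' under the given error action.
// 'retries' is the number of extra attempts made by the two retry actions.
function runWithPolicy(step, errorAction, retries) {
    var retrying = errorAction === ON_ERROR_RETRY ||
                   errorAction === ON_ERROR_RETRY_IGNORE;
    var attempts = retrying ? retries + 1 : 1;
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return step();
        } catch (e) {
            lastError = e;
        }
    }
    if (errorAction === ON_ERROR_IGNORE ||
        errorAction === ON_ERROR_RETRY_IGNORE) {
        return null; // carry on with the run; the failed step yields null
    }
    throw lastError; // ON_ERROR_RAISE, or ON_ERROR_RETRY with retries exhausted
}

// A step that fails twice and then succeeds.
var calls = 0;
function flakyStep() {
    calls++;
    if (calls < 3) {
        throw new Error("TIMEOUT_ERROR");
    }
    return "page";
}

var retried = runWithPolicy(flakyStep, ON_ERROR_RETRY, 3);
var ignored = runWithPolicy(function () { throw new Error("HTTP_ERROR"); },
                            ON_ERROR_IGNORE, 0);
```

Here retried holds the value produced by the third attempt, and ignored is null because the failing step was skipped instead of stopping the run.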

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level to be used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message will be written to the log trace file and 5 means that all message types will be written. The log types are the following:

  o TRACE

  o DEBUG

  o INFO

  o WARN

  o ERROR

  o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List

• Description: adds a record to a list.

• Functions:

  o Constructor()

  o exec(record, list): executes the function.

    • record: record to be added to the list.

    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition

• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.

• Functions:

  o Constructor(expr)

    • expr: defines the condition expression, expressed as a character string. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.

    • elements: the elements on which the condition is evaluated. It must be in the format "[ELEMENT1, ELEMENT2, …, ELEMENTN]".
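The way the $0, $1, … placeholders refer to positions in the elements list can be illustrated with a small stand-alone sketch. ConditionSketch is a hypothetical stand-in for the Condition component. It evaluates the expression as plain JavaScript, which is an assumption made for the illustration; the actual expression language and its functions are the ones defined in [GENER].

```javascript
// Hypothetical stand-in for the Condition component, for illustration only.
function ConditionSketch(expr) {
    // Rewrite each $n placeholder as a lookup of the n-th input element.
    var body = expr.replace(/\$(\d+)/g, "elements[$1]");
    this.test = new Function("elements", "return (" + body + ");");
}

// Applies the condition to the list of elements and returns true or false.
ConditionSketch.prototype.exec = function (elements) {
    return this.test(elements) ? true : false;
};

var myCondition = new ConditionSketch("($0 <= $1)");
var met = myCondition.exec([3, 7]);     // compares the first element with the second
var notMet = myCondition.exec([9, 7]);
```

With these inputs, met is true and notMet is false, because only the first list has a first element that is less than or equal to the second.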

5.3.6 Create List

• Object: Create_List

• Description: creates an empty list.

• Functions:

  o Constructor(listname): creates an empty list.

    • listname: name of the list of records to be created.

  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser

• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.

• Functions:

  o Constructor(): creates a persistent browser and returns its handler.

  o exec(): executes the component.

5.3.8 Diff

• Object: Diff

• Description: the Diff component compares two pages and returns the differences between them in the retrieved HTML code.

• Functions:

  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)

    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, a green background HTML tag).

    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, the green background HTML end tag).

    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, a red background HTML tag).

    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, the red background HTML end tag).

    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.

  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.

    • baseCode: character string with the source page content.

    • finalCode: character string or page object with the target page content.

  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.

    • baseCode: character string with the source page content.

    • finalCode: character string or page object with the target page content.

  o setAdditionPrefixLabel(additionPrefixLabel): modifies the start tag used for added content.

    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, a green background HTML tag).

  o setAdditionSuffixLabel(additionSuffixLabel): modifies the end tag used for added content.

    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, the green background HTML end tag).

  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the start tag used for deleted content.

    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, a red background HTML tag).

  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the end tag used for deleted content.

    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, the red background HTML end tag).

  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return "null" instead of the page itself.

    • nullWhenEquals: "true" implies that "null" will be returned when both pages are equal; "false" means that the result page will be returned.

  o setIgnoreTagAttributes(simplifyTags): the component will not take the HTML tag attributes into account when comparing both pages.

    • simplifyTags: "true" means that the HTML tag attributes will be ignored. With "false", they will not be ignored.

  o setCaseInsensitive(toLowerCase): establishes whether capitalization will be taken into account when comparing the pages.

    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.

  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page or not.

    • mergedDeletions: with "true", the deleted content will be shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.

  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.

    • replacement: Perl [PERL] regular expression.

  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page. Tokens that match the regular expression will be discarded before the comparison starts.

    • regexp: Perl [PERL] regular expression.

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;

    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.

• Functions:

  o Constructor(expression)

    • expression: object that defines the condition expression, expressed as a character string. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.

    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor

• Description: responsible for extracting structured data from an HTML page by means of a DEXTL program ([DEXTL]).

• Functions:

  o Constructor(name, page, specification, structure)

    • name: name of the Extractor component instance.

    • page: page-type ITPilot structure from which data is to be extracted.

    • specification: DEXTL data extraction specification (see [DEXTL]).

    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.

  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.

  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).

    • merge: Boolean parameter. "true" if the pattern merging technique is to be applied, "false" if not. It is "true" by default.

  o setI18n(i18n): updates the internationalization configuration of the process.

    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

  o Constructor(url, sequenceType, reusableConnection, binary, page)

    • url (optional): URL where the resource to be downloaded can be found.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.

    • page (optional): page from which the HTTP request is launched.

  o exec(page): runs the component, returning the string- or binary-type value obtained.

    • page (optional): page from which the HTTP request is launched.

  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.

    • encoding: MIME type of the information to send.

  o syncWithPost(flag): sets the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.

    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, ITPilot executes a Back() NSEQL command.

  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, against which further data extraction operations are going to be executed.

    • back: NSEQL back sequence program.

  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

    • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing the information from the previous session.

  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor a POST navigation has been defined.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation.

    • 1: Internet Explorer browser implementation.

    • 2: Firefox browser implementation.

    • 3: Denodo HTTP browser implementation.
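The order in which syncWithPost, setBackSequence and setBackPages are taken into account can be summarized as a small decision function. This is a sketch of the rules stated above and not ITPilot code: chooseRecovery, the settings object and the returned method names are hypothetical.

```javascript
// 'settings' is a hypothetical plain object mirroring the setter functions above:
//   syncWithPost -> syncWithPost(flag)
//   backSequence -> setBackSequence(back)
//   backPages    -> setBackPages(pages)
function chooseRecovery(settings) {
    if (settings.syncWithPost) {
        // Default method: re-send the POST parameters used to reach the page
        return { method: "POST" };
    }
    if (settings.backSequence) {
        // An explicit NSEQL back sequence takes precedence over Back()
        return { method: "BACK_SEQUENCE", sequence: settings.backSequence };
    }
    // Neither POST synchronization nor a back sequence: use the Back() command
    return { method: "BACK", pages: settings.backPages };
}

var byPost = chooseRecovery({ syncWithPost: true, backSequence: null, backPages: 1 });
var bySequence = chooseRecovery({ syncWithPost: false,
                                  backSequence: "<NSEQL back sequence>",
                                  backPages: 1 });
var byBack = chooseRecovery({ syncWithPost: false, backSequence: null, backPages: 2 });
```

The value passed to setBackPages only matters in the last case, where neither of the other two mechanisms applies.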

5.3.13 Filter

• Object: Filter

• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

  o Constructor(expr, auxiliaryRecords)

    • expr: selection expression of the filtering operation, applied to the list of records passed to the exec function.

    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.

    • inputRecords: list of input records.

    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
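The effect of ON_ERROR_IGNORE described in the note can be sketched as follows. The filterIgnoringErrors function and the sample records are hypothetical, and a plain JavaScript predicate stands in for the selection expression; the sketch only illustrates that the record whose evaluation fails is left out while the rest are filtered normally.

```javascript
// Hypothetical stand-in: 'predicate' plays the role of the selection expression.
function filterIgnoringErrors(inputRecords, predicate) {
    var kept = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) {
                kept.push(inputRecords[i]);
            }
        } catch (e) {
            // ON_ERROR_IGNORE: leave out only the record that caused the error
        }
    }
    return kept;
}

var records = [
    { TITLE: "a", PRICE: "10" },
    { TITLE: "b", PRICE: null },   // evaluating the predicate on this record fails
    { TITLE: "c", PRICE: "30" },
    { TITLE: "d", PRICE: "5" }
];

// Keep records whose price is at least 10.
var filtered = filterIgnoringErrors(records, function (record) {
    return record.PRICE.length > 0 && parseInt(record.PRICE, 10) >= 10;
});
```

Here filtered contains records "a" and "c": record "d" does not meet the condition, and record "b" is dropped because evaluating it raised an error.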

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: generates a run loop for a specific form, where predetermined values for each of the included fields are used in each run.

• Functions:

  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.

    • inputPage: input page from which the selected form can be iteratively invoked.

    • parallelIterator: with "true", the component will execute its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting with position 0.

    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting with position 0.

    • valuesArray: list of values that must be selected in the field.

    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).

    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.

    • field: name of the HTML selection field.

    • position: position occupied by the field in the event of more than one field element with the same name.

    • positions: values of the elements on which the component must iterate.

  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.

    • field: name of the HTML text field.

    • position: position of the field in the event of several on the form with the same name.

    • values: list of values that must be selected in the field.

    • positions: list that indicates the position held for each value element in the event of replicated values.

    • equals: boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.

  o click(field, value, state): selects an element and runs a "click" event on it.

    • field: name of the HTML field on which the click is to be made.

    • value: when this function is run on Radio Buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.

    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o input(field, position, values): indicates the values added to an input field.

    • field: name of the HTML input field.

    • position: position of the field in the event of several on the form with the same name.

    • values: list of values that must be entered in the field.

  o textarea(field, position, values): indicates the values added to a text area.

    • field: name of the HTML text area field.

    • position: position of the field in the event of several on the form with the same name.

    • values: list of values that must be entered in the field.

  o toList(): returns the list with the NSEQL sequences used in each iteration.

  o setMaxIterations(count): sets the maximum number of iterations that can be executed.

    • count: maximum number of iterations.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o setParallelIterator(flag): makes the component launch its iterations in parallel.

    • flag: with "true", the iterations will be executed in parallel.

  o next(inputPage): returns the page resulting from running a component iteration.

    • inputPage: optional parameter that indicates a new starting page on which the new component iteration is run.

o hasNext() function that determines whether there are more results The function returns ldquotruerdquo if there is at least one more result or ldquofalserdquo if there is not

o close() function that closes the iterator

o syncWithPost(flag) this function indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization method

bull flag ldquotruerdquo indicates that this synchronization function is to be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence function If there is not an NSEQL Back() command is run

o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out

bull back NSEQL back program

o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not

bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session

o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 Denodo HTTP browser implementation
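
The order in which syncWithPost, setBackSequence and setBackPages are consulted can be summarized as a small decision function. This is only a sketch of the precedence described above: chooseBackStrategy and the shape of its config argument are hypothetical names used for illustration and are not part of the ITPilot API, and the default of one back page is an assumption.

```javascript
// Hypothetical helper, not ITPilot code. It mirrors the precedence described
// for syncWithPost, setBackSequence and setBackPages.
function chooseBackStrategy(config) {
    if (config.syncWithPost) {
        // Re-issue the POST request with the original POST parameters.
        return { strategy: "POST" };
    }
    if (config.backSequence) {
        // Run the explicit NSEQL back program.
        return { strategy: "BACK_SEQUENCE", sequence: config.backSequence };
    }
    // Fall back to the NSEQL Back() command (default of 1 page is assumed).
    return { strategy: "BACK_COMMAND", pages: config.backPages || 1 };
}

var chosen = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

In this example no POST synchronization and no back sequence are configured, so the function selects the Back() command with two pages.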

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries on the wrapper must provide a value for the field. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries on the wrapper must provide a value for the field. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries on the wrapper must provide a value for the field. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries on the wrapper must provide a value for the field. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries on the wrapper must provide a value for the field. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries on the wrapper must provide a value for the field. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries on the wrapper must provide a value for the field. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether queries on the wrapper must provide a value for the field. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional, but becomes compulsory if completed; otherwise the wrapper may not be run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether queries on the wrapper must provide a value for the field. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record that represents the wrapper initialization parameters.
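
A possible way of declaring the searchable parameters of a wrapper with these functions is sketched below. So that the snippet runs outside ITPilot, it defines its own stand-in for Init and for the OPTIONAL, OBLIGATORY and FIXED constants; their values and internals are assumptions, as are the field names KEYWORD, LANGUAGE and FROM_DATE. Only the call pattern at the end reflects the functions listed above.

```javascript
// Stand-ins so the snippet runs outside ITPilot. The constant values and the
// Init internals are assumptions, not the real implementation.
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

function Init() {
    this.fields = [];
}
// Records a text field; obl defaults to OPTIONAL, as in the list above.
Init.prototype.setText = function (field, obl, fixedValue) {
    this.fields.push({ name: field, type: "text", obl: obl || OPTIONAL, fixedValue: fixedValue });
};
// Records a date field together with its [DATEFORMAT]-style pattern.
Init.prototype.setDate = function (field, format, obl, fixedValue) {
    this.fields.push({ name: field, type: "date", format: format, obl: obl || OPTIONAL, fixedValue: fixedValue });
};
// Returns the declared fields (the real exec() returns a record).
Init.prototype.exec = function () {
    return this.fields;
};

// Call pattern: one obligatory field, one fixed field, one optional date.
var init = new Init();
init.setText("KEYWORD", OBLIGATORY);
init.setText("LANGUAGE", FIXED, "en");
init.setDate("FROM_DATE", "dd/MM/yyyy");
var declared = init.exec();
```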

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns "true" if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
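
For comparison, without parallel execution the same traversal is a plain hasNext()/next() loop. The sketch below is a minimal illustration: the Iterator defined here is a stand-in backed by a JavaScript array so that the snippet runs outside ITPilot, and processRecord and the TITLE field are hypothetical.

```javascript
// Stand-in for the ITPilot Iterator, backed by an array (assumption).
function Iterator(list) {
    this.list = list;
    this.position = 0;
}
Iterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
Iterator.prototype.next = function () {
    return this.list[this.position++];
};

// Hypothetical per-record processing function.
var processed = [];
function processRecord(record) {
    processed.push(record.TITLE);
}

// Sequential traversal: one record per pass, in list order.
var iterator = new Iterator([{ TITLE: "first" }, { TITLE: "second" }]);
while (iterator.hasNext()) {
    processRecord(iterator.next());
}
```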

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The Loop component is implemented in JavaScript with a while loop that uses a Condition object as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: allows iterating over different, inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session information.
    • inputPage: indicates the page from which the next browsing sequence is run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds, at a later point, the component input record that is to be used as the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field. For example, "$0.PARAM1" indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements except for the one that caused the error.
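
The field-reference notation used in the expression parameter can be read as "field PARAM1 of the input element at position 0 of recordsObj". The sketch below resolves only that simple form, assuming the reference is written as a dollar sign, an index, a dot and a field name, and that each input element is a single record represented as a plain JavaScript object. resolveFieldReference is a hypothetical helper, not an ITPilot function, and it does not evaluate the functions of Appendix A of [GENER].

```javascript
// Hypothetical helper: resolves references of the form "$<index>.<FIELD>".
function resolveFieldReference(expression, records) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = records[parseInt(match[1], 10)];
    if (record === undefined) {
        throw new Error("No input record at index " + match[1]);
    }
    return record[match[2]];
}

// Two input records, as they could be passed in recordsObj.
var recordsObj = [{ PARAM1: "denodo" }, { PRICE: 12.5 }];
var first = resolveFieldReference("$0.PARAM1", recordsObj);
var second = resolveFieldReference("$1.PRICE", recordsObj);
```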

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session information. In general this value will be "true", although in some cases this may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT ... UNTIL). The Repeat component is implemented in JavaScript with a do ... while loop that uses a Condition object as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
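
A practical difference between the Loop structure (Figure 6) and the Repeat structure (Figure 7) is the moment at which the condition is first evaluated: a while loop checks it before the first pass, whereas a do ... while loop always runs the body once. The sketch below illustrates this with a stand-in Condition whose exec() simply calls a JavaScript function; the real Condition object evaluates an ITPilot expression, so this stand-in is an assumption made only to keep the snippet runnable.

```javascript
// Stand-in for Condition: exec() just calls the supplied function (assumption).
function Condition(test) {
    this.test = test;
}
Condition.prototype.exec = function (inputs) {
    return this.test(inputs);
};

var alwaysFalse = new Condition(function () { return false; });

// Loop structure: the condition is checked before the first pass.
var loopPasses = 0;
while (alwaysFalse.exec([])) {
    loopPasses++;
}

// Repeat structure: the body runs once before the condition is checked.
var repeatPasses = 0;
do {
    repeatPasses++;
} while (alwaysFalse.exec([]));
```

With a condition that is never true, the Loop body is skipped entirely while the Repeat body still runs one time.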

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component holds in the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the component browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
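
The setRetries and setRetryDelay functions describe a retry policy. What such a policy amounts to can be sketched in plain JavaScript as follows. runWithRetries is a hypothetical helper, not ITPilot code; it assumes that the retry count means the number of extra attempts after the first failure, and it receives the waiting function as an argument so that the example does not actually pause.

```javascript
// Hypothetical helper, not ITPilot code. `retries` is taken to mean the number
// of extra attempts after the first failure (an assumption).
function runWithRetries(action, retries, delayMs, wait) {
    var attempt = 0;
    for (;;) {
        try {
            return action(attempt);
        } catch (error) {
            if (attempt >= retries) {
                throw error; // retries exhausted: propagate the last error
            }
            attempt++;
            wait(delayMs); // pause before the next attempt
        }
    }
}

// Example: an action that fails twice and then succeeds.
var waits = [];
var result = runWithRetries(function (attempt) {
    if (attempt < 2) {
        throw new Error("navigation failed");
    }
    return "page loaded on attempt " + attempt;
}, 3, 500, function (ms) { waits.push(ms); });
```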

5.3.28 Store File

• Object: StoreFile
• Description: stores in a file the contents received as the input parameter.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches the execution thread on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
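
The queueing rule described for setMaxConcurrentThreads can be modelled with a small scheduler. The sketch below is not ITPilot code and starts no real threads: BoundedScheduler and its finish function are hypothetical, the task names are invented, and the first-in, first-out order in which queued requests are resumed is an assumption.

```javascript
// Hypothetical scheduler, not ITPilot code: it only models the queueing rule.
function BoundedScheduler(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = [];
    this.queued = [];
}
// Starts the task if a slot is free; otherwise queues it.
BoundedScheduler.prototype.execute = function (taskName) {
    if (this.running.length < this.maxConcurrent) {
        this.running.push(taskName);
    } else {
        this.queued.push(taskName);
    }
};
// Marks a task as finished and promotes the oldest queued task, if any.
BoundedScheduler.prototype.finish = function (taskName) {
    var index = this.running.indexOf(taskName);
    if (index === -1) {
        return;
    }
    this.running.splice(index, 1);
    if (this.queued.length > 0) {
        this.running.push(this.queued.shift());
    }
};

var scheduler = new BoundedScheduler(2);
scheduler.execute("record-1");
scheduler.execute("record-2");
scheduler.execute("record-3"); // queued: both slots are busy
scheduler.finish("record-1");  // record-3 now takes the free slot
```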

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure that the component returns. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
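
Put together, the file of a custom component could follow the skeleton below. This is a minimal sketch under several assumptions: the component logic (upper-casing a TEXT field) is invented for illustration, RECORD_TYPE is redefined only so that the snippet runs outside ITPilot, and the objects returned by the two structure functions are placeholders, since this guide does not show the actual structure format.

```javascript
// Output-type constant as listed above, redefined here only for a standalone run.
var RECORD_TYPE = 3;

// Main function: receives the component input and returns its output.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    // Hypothetical logic: upper-case the TEXT field of the input.
    mycustom_output = { TEXT: String(mycustom_input.TEXT).toUpperCase() };
    return mycustom_output;
}

// Input schema. The layout returned here is a placeholder (assumption).
function mycustom_getInputStructure() {
    return { TEXT: "string" };
}

// Declares that the component returns a record.
function mycustom_getOutputType() {
    return RECORD_TYPE;
}

// Required because the output type is RECORD_TYPE (placeholder layout as well).
function mycustom_getOutputStructure() {
    return { TEXT: "string" };
}
```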

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it should be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally they must be escaped with the '\' character.
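
Escaping the script by hand is error-prone, so it can be convenient to generate the statement. The sketch below assumes that the code is delimited by double quotes, that the escape character is the backslash, and that only the delimiter itself needs escaping; none of these details is confirmed by this guide. escapeForVql and buildCreateWrapper are hypothetical helpers, and the optional MAINTENANCE clause is left out.

```javascript
// Hypothetical helper. It assumes double-quote delimiters and that only the
// delimiter itself needs a backslash; adjust if the VQL syntax differs.
function escapeForVql(jsCode, quote) {
    quote = quote || '"';
    return jsCode.split(quote).join("\\" + quote);
}

// Builds the CREATE WRAPPER statement around the escaped script.
function buildCreateWrapper(name, jsCode) {
    return "CREATE WRAPPER ITP " + name + " \"" + escapeForVql(jsCode) + "\"";
}

var statement = buildCreateWrapper("demo", 'var greeting = "hi";');
```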

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

1 INTRODUCTION

Denodo ITPilot is a Denodo Technologies solution for extracting and structuring the data present in Web sources. This is done by constructing an abstraction of the target Web source, called a “wrapper”, that frees client applications from the difficulties of accessing and extracting the required data. ITPilot provides a distributed and scalable environment for generating, executing and maintaining wrappers. See [USER] and [GENER] for more information on how to create, install and maintain wrappers using Denodo ITPilot.

This manual describes the Java development API, which allows creating clients that use wrappers that have already been generated and installed. It gives the basic guidelines for using the API, describes its main components and provides some examples of use. See the Javadoc documentation [JDOC] for more details on classes, attributes and operations. This manual also explains how to access wrappers through Web Services exported in the execution environment.

2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES

The wrappers saved in the execution server can be invoked in two different ways. Firstly, the native ITPilot Java API can be used to access the wrappers, obtain their data structure and run queries on them from a Java application; this API is described in section 3. Alternatively, the wrappers can be exposed through Web Services; this option is described in this section. For a particular wrapper, a Web Service containing the following operations can be generated:

• An operation containing all searchable and compulsory parameters.

• Optionally, another operation with all searchable and compulsory parameters plus any searchable and optional parameters selected in the Web Service generation process (this process is defined in [USER]).

The ITPilot execution server generates a Web Service as a war file that can be deployed in any J2EE application server.

2.1 WEB SERVICE TYPES

ITPilot allows a wrapper to be published as a Web Service so that any external application can use it. The ITPilot execution server generates a Web Service as a war file that can be deployed in any J2EE application server. The types of Web Services that ITPilot can publish are:

• SOAP [SOAP] Web Services.

• REST-style Web Services, which use HTTP directly as the transport protocol and return data encoded in XML.

• HTML Web Services. These are similar to the REST-style Web Services, but the output consists of an HTML table containing the response data for the query executed. The table includes JavaScript code to sort the results by any field and/or paginate the returned results. It is also possible to adjust the size of the table and the cells, and to modify its graphic appearance using a CSS file.

The following sections describe the querying process for these Web Services.

2.2 INVOKING SOAP WEB SERVICES

The SOAP version of the published Web Services can be accessed by using any Web Service client or client generator that meets the SOAP 1.2 [SOAP] and WSDL 1.1 [WSDL] standards, such as the Apache Axis wsdl2java [AXIS] or .NET Framework wsdl [DOTNET] tools. The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL of the Web Service WSDL:

http://<domain>:<port>/<service_name>/services/<service_name>?wsdl

The ITPilot distribution contains, in the samples/itpilot/itp-clients directory, a sample client generated using Apache Axis. The README file in this path contains detailed information on how to generate, compile and run the files comprising the client application.

2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES

This section describes how to invoke the REST and HTML versions of the Web Services that have been published by DataPort, once they have been deployed in the Web Service container. Once the war file has been deployed in the J2EE application server, the relative paths /rest and /html of the webapp show an information screen for the respective Web Service version, listing the available operations.

Example: if the Web Service container is running on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the access URLs for the information page in the REST (XML output) and HTML versions would be:

http://acme:9090/testWS/rest
http://acme:9090/testWS/html

For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the xsd file describing the schema of the XML document returned by each operation. To access the XML Schema of the data returned by an operation of the REST version of the Web Service, use the following URL format:

http://host:port/serviceName/rest/opName.xsd

Example: again, if the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS, the following URL obtains the XML Schema of the data returned by the operation getPRODUCTDATA:

http://acme:9090/testWS/rest/getPRODUCTDATA.xsd

The format used to invoke a specific operation in the REST version is the following:

http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen

where n is the number of parameters of the operation. The format for the HTML version is the same, replacing ‘rest’ with ‘html’.

Example: the Web Service container runs on port 9090 of the acme host and the name chosen for the exported Web Service was testWS. Let us also suppose that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML Web Service versions, respectively:

http://acme:9090/testWS/rest/getPRODUCTDATA
http://acme:9090/testWS/html/getPRODUCTDATA

If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results for a parameter value of 1 would be obtained with:

http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
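The invocation URLs described in this section can also be assembled programmatically. The following is a minimal sketch in Java, not part of the ITPilot API: the class RestUrlSketch and its method operationUrl are hypothetical names, and URL-encoding the parameter values as UTF-8 is an assumption of the sketch. It only builds the URL string and does not contact any server.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the ITPilot API.
public class RestUrlSketch {

    // version is "rest" or "html"; params may be empty for operations without parameters.
    public static String operationUrl(String host, int port, String serviceName,
                                      String version, String opName,
                                      Map<String, String> params) {
        StringBuilder url = new StringBuilder("http://")
                .append(host).append(':').append(port)
                .append('/').append(serviceName)
                .append('/').append(version)
                .append('/').append(opName);
        char separator = '?';
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(separator)
               .append(encode(p.getKey())).append('=').append(encode(p.getValue()));
            separator = '&'; // every parameter after the first is joined with '&'
        }
        return url.toString();
    }

    // Assumption of this sketch: names and values are URL-encoded as UTF-8.
    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("prod_id", "1");
        System.out.println(operationUrl("acme", 9090, "testWS", "rest",
                "getPRODUCTDATABYPRODID", params));
    }
}
```

Run as written, the main method prints the REST URL of the getPRODUCTDATABYPRODID example; passing "html" instead of "rest" yields the HTML variant.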

2.3.1 HTML Output Configuration

The HTML version of the published Web Services may be invoked with additional parameters that configure the HTML table used to display the query results. The additional parameters are as follows:

• shownumresults: if this parameter is set to true, the table displays the number of results obtained by the wrapper.

• intervalsize: if this parameter is given, the results obtained by the wrapper are displayed paginated. The value of the parameter is the number of results displayed in each interval.

• maxresults: the maximum number of results to be displayed. If the wrapper run returns more results than indicated, all excess results are discarded.

• cellwidth: maximum cell width, expressed in number of characters. The width of each cell in the table adapts to the text, except where the size given in this parameter is exceeded. In that case, carriage returns are added to divide the text into lines.

• cellheight: maximum number of lines in a cell, after the text has been divided according to the cellwidth parameter value. If this is exceeded, all the cells of that column are given a scroll bar.

• width: the maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.

• height: the maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.

These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:

httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen

For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a pagination interval size of 10. Once again, it is assumed that the Web Service container runs on port 9090 of the acme machine:

httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10

2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES

Once the Web Service operations have been exported, some parameters can be used to configure the connection pool that the Web Services use to connect to the ITPilot server. The web.xml file, found in the WEB-INF path of the exported Web Service (either inside the war file generated by ITPilot or in the directory where the Web Service has been deployed), has three parameters for configuring the connection pool:

1. poolEnabled: enables or disables the connection pool. The possible values are “true” or “false”.

    <env-entry>
        <env-entry-name>poolEnabled</env-entry-name>
        <env-entry-value>false</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

2. poolInitSize: defines the initial size of the connection pool.

    <env-entry>
        <env-entry-name>poolInitSize</env-entry-name>
        <env-entry-value>0</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this value, new requests are queued until a connection becomes free.

    <env-entry>
        <env-entry-name>poolMaxActive</env-entry-name>
        <env-entry-value>30</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
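To check which pool settings a deployed Web Service is using, the env-entry elements shown in this section can be read back from web.xml. The following is a minimal sketch using only the standard Java XML parser; the class PoolSettingsSketch and its method readEnvEntries are hypothetical names, and the sketch assumes every env-entry carries both a name and a value, as in the entries above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of the ITPilot distribution.
public class PoolSettingsSketch {

    // Returns the env-entry name/value pairs found in the given web.xml content.
    public static Map<String, String> readEnvEntries(String webXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(webXml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> entries = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("env-entry");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element entry = (Element) nodes.item(i);
                entries.put(text(entry, "env-entry-name"), text(entry, "env-entry-value"));
            }
            return entries;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse web.xml content", e);
        }
    }

    // Text of the first child element with the given tag name.
    private static String text(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        String webXml = "<web-app>"
                + "<env-entry><env-entry-name>poolEnabled</env-entry-name>"
                + "<env-entry-value>false</env-entry-value>"
                + "<env-entry-type>java.lang.String</env-entry-type></env-entry>"
                + "<env-entry><env-entry-name>poolMaxActive</env-entry-name>"
                + "<env-entry-value>30</env-entry-value>"
                + "<env-entry-type>java.lang.String</env-entry-type></env-entry>"
                + "</web-app>";
        System.out.println(readEnvEntries(webXml));
    }
}
```

The values are returned as strings because the entries are declared with type java.lang.String; converting poolInitSize or poolMaxActive to numbers is left to the caller.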

3 ITPILOT DEVELOPMENT API

Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Among other functions, this API facilitates connecting to a Denodo ITPilot execution server, obtaining a reference to a wrapper installed in that server and querying it. It also supports additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.

The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance makes it possible to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance can be used to carry out various tasks on the wrapper, the most important of which is query execution.

When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).

The following subsections deal in more detail with each of the stages mentioned: connecting to the server, obtaining references to wrappers, executing actions on them and processing query results. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].

3.1 CONNECTING TO THE SERVER

There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.

If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server is running.

If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, it is possible to specify in the connection any database created in the Virtual DataPort server and to use any user defined in it. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).

In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server is running, this constructor takes the name of the Virtual DataPort database to connect to, the user ID used for access and the associated password.

Note that even if Virtual DataPort is installed, it is equally possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called ‘itpilot’ is accessed, using the predefined user ‘admin’ (with the initial password ‘admin’).

3.2 OBTAINING WRAPPERS

As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class provides methods for obtaining data on the execution server and accessing the wrappers present in it:

• Collection getHTMLWrapperNames(): obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only the wrappers associated with that database are obtained.

• HTMLWrapperProxy getHTMLWrapper(String wpName): obtains a reference to the wrapper whose name is specified as a parameter.

• Collection getDatabaseNames(): this method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.

• void deleteWrapper(String wpName): deletes from the server the wrapper whose name is specified as a parameter.

• void loadWrapper(String vql): takes as its input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.

• String getVQL(): returns the VQL description of all wrappers in the ITPilot execution server.

3.3 USING WRAPPERS

Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:

    HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified during the creation of the wrapper. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper is expecting a float-type parameter and we want to assign the value 325 when invoking it, we must pass the string “325”. For the float, double and date data types, it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper's Init component or, for date data types, according to the date pattern if one was set.

For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.

Although most applications will not require it, the schema of a wrapper can be obtained using the method:

    HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).

The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and these fields may again be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a “hierarchical” register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the atomic fields that make up the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field:

• Its type (method java.lang.Class getType()).

• Whether the value is obtained from the source or not, that is, whether it is a searchable field that does not appear in the output schema (method boolean isSearchStatus()) and, in that case, whether it is mandatory (method boolean isMandatoryStatus()).

• If they were defined during the generation process, the regular expression (method java.lang.String getRegexp()) and the aliases defined for the field (method java.util.List getTextValues()).

Finally, the methods

    void setMaintenance(boolean value)
    void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting via the API whether a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the DENODO_HOME/metadata/maintenance-backup path (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper is regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
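The requirement stated earlier in this section, that query values be passed as strings formatted according to the wrapper's internationalization configuration, can be illustrated with the standard java.text.NumberFormat class. This is only a sketch: the class QueryParamFormatSketch, the parameter name MAX_PRICE and the choice to disable digit grouping are assumptions, and the format a given wrapper actually expects is whatever was configured in its Init component.

```java
import java.text.NumberFormat;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not part of the ITPilot API.
public class QueryParamFormatSketch {

    // Formats a number as a string using the decimal separator of the given locale.
    public static String formatNumber(double value, Locale locale) {
        NumberFormat format = NumberFormat.getNumberInstance(locale);
        format.setGroupingUsed(false);        // assumption: no thousands separators
        format.setMaximumFractionDigits(10);  // avoid rounding to the default 3 decimals
        return format.format(value);
    }

    public static void main(String[] args) {
        // MAX_PRICE is a hypothetical input parameter name.
        Map<String, String> queryParams = new HashMap<>();
        queryParams.put("MAX_PRICE", formatNumber(1234.5, Locale.US));
        System.out.println(queryParams);

        // The same value for a wrapper configured with a German locale.
        System.out.println(formatNumber(1234.5, Locale.GERMANY));
    }
}
```

The same number yields a different string depending on the locale (a decimal point for Locale.US, a decimal comma for Locale.GERMANY), which is why the string placed in the query map has to match the wrapper's configuration.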

3.4 PROCESSING QUERY RESULTS

The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time through the network).

The method hasNext() checks whether there are still elements to return. Because results arrive asynchronously, this method must be called before accessing each element to make sure that data is available.

The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method

    com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname)

where fieldname is the name of the desired field. The method next() throws a NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation [JDOC] for details.

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:

• boolean checkErrors(): checks whether an error has occurred during query execution. Returns ‘true’ if an error has occurred and ‘false’ if not.

• String getErrorDescription(): if an error has occurred, returns a textual description of it; otherwise, returns null. The custom error messages specified by the wrapper creator for the raise error handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
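The hasNext()/next() contract described in this section can be illustrated with a small stand-in built only on the Java standard library. This is not the ITPilot implementation: the class AsyncResultsSketch and everything in it are hypothetical, and the sketch assumes that hasNext() waits until either a result or the end of the results arrives, while next() fails if no result has been made available.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Stand-in for illustration only; this is not the ITPilot implementation.
public class AsyncResultsSketch {

    private static final Object END = new Object(); // marks the end of the results

    public static final class AsyncIterator implements Iterator<Object> {
        private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
        private Object pending;   // fetched by hasNext(), not yet handed out
        private boolean finished;

        public void publish(Object row) { queue.add(row); } // called by the producer
        public void close() { queue.add(END); }

        @Override
        public boolean hasNext() {
            if (pending != null) return true;
            if (finished) return false;
            try {
                Object item = queue.take(); // waits until the producer delivers something
                if (item == END) { finished = true; return false; }
                pending = item;
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                finished = true;
                return false;
            }
        }

        @Override
        public Object next() {
            if (pending == null) throw new NoSuchElementException("call hasNext() first");
            Object row = pending;
            pending = null;
            return row;
        }
    }

    // Consumes three rows produced by a separate thread, in arrival order.
    public static List<Object> consumeAll() {
        final AsyncIterator results = new AsyncIterator();
        Thread producer = new Thread(new Runnable() {
            @Override
            public void run() {
                for (int i = 1; i <= 3; i++) results.publish("row-" + i);
                results.close();
            }
        });
        producer.start();
        List<Object> rows = new ArrayList<>();
        while (results.hasNext()) rows.add(results.next());
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(consumeAll());
    }
}
```

In this sketch, calling next() without a preceding hasNext() throws NoSuchElementException even though rows may still be on their way, which mirrors the reason given above for always checking hasNext() first.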

3.4.1 Canceling Queries

The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

    void cancel()

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the ‘acme’ machine, on port 9999. Next, it obtains a reference to the wrapper called “Movies”, whose schema is the one used as an example in the preceding section:

    TITLE
    DIRECTOR
    EDITIONS
        FORMAT
        PRICE
        DESCRIPTION

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value “Woody Allen”, and the results are processed and shown on the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text, except the field PRICE, which is a double. Finally, any errors produced during execution are checked.

    package com.denodo.itpilot.client;

    import java.util.HashMap;
    import java.util.Iterator;
    import java.util.List;
    import java.util.Map;

    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;

    public class ITPilotExample {

        public static void main(String args[]) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

                // Get wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params (values are always passed as strings)
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results; hasNext() must be checked before each next()
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ". ");

                    // Get and print atomic fields TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);

                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);
                        RegisterVO editionVO = (RegisterVO) editions.next();

                        SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);

                        // PRICE is the only field of type double
                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);

                        SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\t DESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server: " + e.getMessage());
            }
        }
    }

Figure 1: Example of query execution on a wrapper

4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. We recommend developing custom functions using Java annotations, although naming conventions can be used instead; the latter allow custom functions to be created without dependencies on Denodo libraries. The annotations and the compound types and values required to create custom functions are located in:

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar is loaded in the server.

• All custom functions stored in the same jar are added or removed together, by uploading the jar to the server or removing it.

• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.

• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, this class must define a Java method implementing the functionality of the function with those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

    Java                  ITPilot
    java.lang.Integer     int
    java.lang.Long        long
    java.lang.Float       float
    java.lang.Double      double
    java.lang.Boolean     boolean
    java.lang.String      text
    java.util.Calendar    date
    byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be Java primitive types (int, long, double, etc.).
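The equivalency table above can be expressed as a lookup from Java class to ITPilot type name. The following is a minimal sketch for illustration; the class TypeMappingSketch and its method itpilotType are hypothetical names and do not reflect how ITPilot performs the mapping internally.

```java
import java.util.Calendar;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup for illustration; not part of the ITPilot API.
public class TypeMappingSketch {

    private static final Map<Class<?>, String> ITPILOT_TYPES = new LinkedHashMap<>();
    static {
        ITPILOT_TYPES.put(Integer.class, "int");
        ITPILOT_TYPES.put(Long.class, "long");
        ITPILOT_TYPES.put(Float.class, "float");
        ITPILOT_TYPES.put(Double.class, "double");
        ITPILOT_TYPES.put(Boolean.class, "boolean");
        ITPILOT_TYPES.put(String.class, "text");
        ITPILOT_TYPES.put(Calendar.class, "date");
        ITPILOT_TYPES.put(byte[].class, "binary");
    }

    // Returns the ITPilot type name listed for the given Java class.
    // The lookup is exact, so primitive classes such as int.class are rejected.
    public static String itpilotType(Class<?> javaType) {
        String type = ITPILOT_TYPES.get(javaType);
        if (type == null) {
            throw new IllegalArgumentException(
                    "No ITPilot equivalent listed for " + javaType.getName());
        }
        return type;
    }

    public static void main(String[] args) {
        System.out.println(itpilotType(Double.class));
        System.out.println(itpilotType(byte[].class));
    }
}
```

Looking up int.class fails in this sketch, which matches the note above: signatures must use the wrapper classes (java.lang.Integer, java.lang.Double, and so on) rather than primitives.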

4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.

To make a Java class recognizable as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + “ItpFunction”

This way, a Java class named Concat_SampleItpFunction is interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String generates the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array; e.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: class annotation used to define the class as a custom function. The annotation requires the parameters:

  • name: name of the custom function.

  • type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor: method annotation used to mark the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional attribute, syntax, that specifies the syntax of the function signature as presented to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType: method annotation used to mark the method as the one that computes the return type of a function signature before a query is executed.

• com.denodo.common.custom.annotations.CustomParam: parameter annotation carrying the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.

4.2 COMPOUND TYPES

Compound types and values in custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on their input values, and functions returning compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.

The return type of the additional method is determined by the return type of the execute method:

• If the execute method returns a basic Java type, the additional method has to return the same basic Java class, i.e. if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 for the type that these return parameters will have in ITPilot.

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name="regexp") String regex,
                @CustomParam(name="valuer") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2: ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.

Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist (see note 1).

These functions correspond to the definition of the process as components: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

Note 1: Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. The old name is still supported for backwards compatibility, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on in section 5.3 (the component catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The published functions of every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure

The getOutputSchema() function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is described in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters of the functions described can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if a value has to be given for another argument to its right. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL).
rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).
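The omission rule described in this note can be illustrated with a small stand-in. The sketch below is not the ITPilot implementation: RecordStructureSketch and the string values given to OPTIONAL and OBLIGATORY are hypothetical, and only mimic how trailing arguments fall back to their documented defaults.

```javascript
// Hypothetical stand-ins for the ITPilot constants (real values are not documented here).
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Hypothetical stand-in for Record_Structure, used only to show positional defaults.
function RecordStructureSketch(name) {
    this.name = name;
    this.fields = [];
}

// setText(field, regexp, type): regexp defaults to "" and type to OPTIONAL.
RecordStructureSketch.prototype.setText = function (field, regexp, type) {
    this.fields.push({
        field: field,
        regexp: (regexp === undefined) ? "" : regexp,
        type: (type === undefined) ? OPTIONAL : type
    });
};

var rs = new RecordStructureSketch("OUT_REC");
rs.setText("A");                  // same as rs.setText("A", "", OPTIONAL)
rs.setText("B", "", OBLIGATORY);  // regexp has to be given in order to reach type
rs.setText("C", OBLIGATORY);      // misuse: OBLIGATORY lands in the regexp position
```

In this sketch the third call silently stores OBLIGATORY as the regular expression and leaves the field optional, which is the kind of mistake the rule is meant to prevent.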

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.

5.3.2.1 Record Structure

• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.

    • field: name of the new field.
    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setBoolean(field, type): creates a new Boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
    • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described first, in this section. The catalog indicates which components lack any of these "common" functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave in the event of an error. The onError function can be invoked several times with different errorId values.
  o errorId: indicates the type of error whose behavior is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error caused when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowser.MAX_DOWNLOAD_TIME.1 for Internet Explorer, IEBrowser.MAX_DOWNLOAD_TIME.2 for Firefox and IEBrowser.MAX_DOWNLOAD_TIME.3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value return "null" when there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it is evaluated as "false" if they are used in a condition expression.
    • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configured in each parameter.
    • ON_ERROR_RETRY_IGNORE: rerun the wrapper, as with the ON_ERROR_RETRY action, but continue with the wrapper execution if the error persists after the retries.
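The four error actions can be compared with the following plain-JavaScript sketch. It is an analogy, not ITPilot code: runWithAction and the string values of the constants are hypothetical, the delay between retries is left out, and it assumes that ON_ERROR_RETRY raises the error once the retries are exhausted (which is what the description of ON_ERROR_RETRY_IGNORE suggests).

```javascript
// Hypothetical values; in ITPilot these are predefined constants.
var ON_ERROR_RAISE = "RAISE";
var ON_ERROR_IGNORE = "IGNORE";
var ON_ERROR_RETRY = "RETRY";
var ON_ERROR_RETRY_IGNORE = "RETRY_IGNORE";

// Runs step() under the given action; retries is the number of extra attempts.
function runWithAction(step, action, retries) {
    var retrying = (action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE);
    var attempts = retrying ? retries + 1 : 1;
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return step();
        } catch (e) {
            lastError = e;
        }
    }
    if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
        return null; // continue the run; the failed step yields null
    }
    throw lastError; // RAISE, or RETRY after all attempts failed
}

// A step that fails twice and then succeeds.
var calls = 0;
function flaky() {
    calls++;
    if (calls < 3) { throw new Error("TIMEOUT_ERROR"); }
    return "page";
}

var result = runWithAction(flaky, ON_ERROR_RETRY, 3); // succeeds on the third attempt
```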

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression, expressed as a character string. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements on which the condition is evaluated.
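How the positional placeholders $0, $1, ... relate to the list passed to exec can be shown with a plain-JavaScript analogy. The helper below is hypothetical and far simpler than the ITPilot expression language: it only understands a comparison between two placeholders, using the operators <=, >=, < and >.

```javascript
// Hypothetical helper: evaluates expressions of the form "($i <op> $j)" only.
function evalPositional(expr, elements) {
    var m = /^\(?\s*\$(\d+)\s*(<=|>=|<|>)\s*\$(\d+)\s*\)?$/.exec(expr);
    if (m === null) {
        throw new Error("unsupported expression: " + expr);
    }
    // $n refers to the n-th entry (starting at 0) of the elements list.
    var left = elements[parseInt(m[1], 10)];
    var right = elements[parseInt(m[3], 10)];
    switch (m[2]) {
        case "<=": return left <= right;
        case ">=": return left >= right;
        case "<": return left < right;
        default: return left > right;
    }
}

var met = evalPositional("($0 <= $1)", [3, 7]);     // true: 3 <= 7
var notMet = evalPositional("($0 <= $1)", [9, 7]);  // false: 9 > 7
```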

5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.

5.3.8 Diff

• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages and points out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag of the added data.
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag of the added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag of the deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag of the deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to any of the two input pages, the component returns "null" instead of the page itself.
    • nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): the component does not take the HTML tag attributes into account when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes are ignored; with "false" they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
    • mergedDeletions: with "true", the deleted content is shown. If the value is "false", the configuration set with the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. These regular expressions are applied to the HTML tokens of the page; those that match a regular expression are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
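The role of the prefix and suffix labels can be pictured with a generic token-level diff. The sketch below is not the algorithm used by the Diff component, which this guide does not describe: it is a textbook longest-common-subsequence comparison, and diffTokens, the labels object and the bracket markers are all hypothetical names chosen for the example.

```javascript
// Generic LCS-based token diff; added and deleted tokens are wrapped in labels.
function diffTokens(baseTokens, finalTokens, labels) {
    var n = baseTokens.length, m = finalTokens.length, i, j;
    // lcs[i][j]: length of the longest common subsequence of base[i..] and final[j..]
    var lcs = [];
    for (i = 0; i <= n; i++) {
        lcs[i] = [];
        for (j = 0; j <= m; j++) { lcs[i][j] = 0; }
    }
    for (i = n - 1; i >= 0; i--) {
        for (j = m - 1; j >= 0; j--) {
            lcs[i][j] = (baseTokens[i] === finalTokens[j])
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }
    var out = [];
    i = 0; j = 0;
    while (i < n && j < m) {
        if (baseTokens[i] === finalTokens[j]) {
            out.push(baseTokens[i]); i++; j++;          // unchanged token
        } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
            out.push(labels.delPrefix + baseTokens[i] + labels.delSuffix); i++;
        } else {
            out.push(labels.addPrefix + finalTokens[j] + labels.addSuffix); j++;
        }
    }
    while (i < n) { out.push(labels.delPrefix + baseTokens[i++] + labels.delSuffix); }
    while (j < m) { out.push(labels.addPrefix + finalTokens[j++] + labels.addSuffix); }
    return out;
}

var labels = { addPrefix: "[+", addSuffix: "+]", delPrefix: "[-", delSuffix: "-]" };
var merged = diffTokens(["<p>", "old", "</p>"], ["<p>", "new", "</p>"], labels).join("");
// merged is "<p>[-old-][+new+]</p>"
```

With HTML labels in place of the bracket markers, the same structure would yield a page in which additions and deletions are highlighted, which is the kind of output exec is described as returning.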

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression
• Description: allows an expression to be defined (based on constants and/or on functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression, expressed as a character string. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor
• Description: responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern merge technique is to be applied, "false" if not. It is "true" by default.
  o setI18n(i18n): updates the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (optional).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string-type or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. With this method, ITPilot sends a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence, defined by the setBackSequence function, exists; if it does not exist, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit navigation sequence that returns to the originating page, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

    • reusingConnection: if the value is set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, if neither a back sequence nor POST synchronization has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
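The order in which the page state recovery options are considered (syncWithPost, setBackSequence and the Back() command) can be summarized as a small decision function. This is only a reading aid written in plain JavaScript: chooseRecoveryMethod and the strings it returns are hypothetical and are not part of the ITPilot API.

```javascript
// Hypothetical summary of the precedence described for syncWithPost and setBackSequence.
function chooseRecoveryMethod(syncWithPostFlag, backSequence) {
    if (syncWithPostFlag) {
        return "POST";            // re-send the original POST parameters to the page URL
    }
    if (backSequence) {
        return "BACK_SEQUENCE";   // run the explicit NSEQL back sequence
    }
    return "BACK_COMMAND";        // fall back to the Back() NSEQL command
}

var byDefault = chooseRecoveryMethod(true, null);
var withSequence = chooseRecoveryMethod(false, "<NSEQL back sequence>");
var fallback = chooseRecoveryMethod(false, null);
```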

5.3.13 Filter

• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for the list of records given in the exec function.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
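The behavior described in this note can be pictured with the following plain-JavaScript analogy. It is not the Filter component itself: filterIgnoringErrors, the predicate and the sample records are hypothetical, and an exception thrown by the predicate stands in for an error raised while evaluating the filter condition.

```javascript
// Keeps the records that satisfy the predicate; a record whose evaluation
// fails is left out, and the remaining records are still evaluated.
function filterIgnoringErrors(records, predicate) {
    var kept = [];
    for (var i = 0; i < records.length; i++) {
        try {
            if (predicate(records[i])) {
                kept.push(records[i]);
            }
        } catch (e) {
            // ON_ERROR_IGNORE analogy: skip only the record that caused the error
        }
    }
    return kept;
}

// Hypothetical condition: PRICE must be present and lower than 100.
function cheap(record) {
    if (record.PRICE === undefined) { throw new Error("RUNTIME_ERROR: no PRICE"); }
    return record.PRICE < 100;
}

var records = [{ NAME: "a", PRICE: 10 }, { NAME: "b" }, { NAME: "c", PRICE: 500 }, { NAME: "d", PRICE: 99 }];
var filtered = filterIgnoringErrors(records, cheap); // records "a" and "d"
```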

5.3.14 Form Iterator

• Object: Form_Iterator
• Description: generates an execution loop over a specific form, using predetermined values for each of its fields in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: if “true”, the component executes its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained therein (equals = false).
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field has the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held for each value element in the event of replicated values.
    • equals: boolean value that indicates whether the field values must exactly match those provided to the function or may merely be contained in them.


  o click(field, value, state): selects an element and runs a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on Radio Buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.

  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if “true”, the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.


  o hasNext(): determines whether there are more results. It returns “true” if there is at least one more result, or “false” if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization method.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
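
The effect of the clickedArray constants on the number of iterations can be sketched outside ITPilot. The following is a minimal stand-alone illustration, not ITPilot code: the constant values and the helpers expandClickedStates and makeCombinationIterator are hypothetical. It only shows that each CLICKED_AND_NON_CLICKED_ELEMENT entry doubles the number of combinations, and it walks the result with a hasNext()/next() loop in the style of the component.

```javascript
// Hypothetical stand-ins for the ITPilot constants; the real values are not
// given in this guide.
const CLICKED_ELEMENT = "clicked";
const NON_CLICKED_ELEMENT = "nonClicked";
const CLICKED_AND_NON_CLICKED_ELEMENT = "both";

// Expands a clickedArray into every combination of marked (true) and
// unmarked (false) states that it describes.
function expandClickedStates(clickedArray) {
  let combinations = [[]];
  for (const state of clickedArray) {
    const options =
      state === CLICKED_ELEMENT ? [true] :
      state === NON_CLICKED_ELEMENT ? [false] :
      [true, false]; // CLICKED_AND_NON_CLICKED_ELEMENT: both variants
    const expanded = [];
    for (const partial of combinations) {
      for (const option of options) {
        expanded.push(partial.concat(option));
      }
    }
    combinations = expanded;
  }
  return combinations;
}

// Minimal iterator over the combinations, mimicking hasNext()/next().
function makeCombinationIterator(clickedArray) {
  const combinations = expandClickedStates(clickedArray);
  let index = 0;
  return {
    hasNext: () => index < combinations.length,
    next: () => combinations[index++],
  };
}

const iterator = makeCombinationIterator([
  CLICKED_ELEMENT,
  CLICKED_AND_NON_CLICKED_ELEMENT,
  NON_CLICKED_ELEMENT,
]);
const visited = [];
while (iterator.hasNext()) {
  visited.push(iterator.next());
}
console.log(visited);
```

With one "both" entry among three elements, the loop visits two combinations; two such entries would give four.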


5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing by means of a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored “cookies” information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.


5.3.16 Init

• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created as a group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.


  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This parameter is optional, but once a format is specified it becomes compulsory; otherwise, the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component; it returns a record representing the wrapper initialization parameters.
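
The meaning of the OPTIONAL, OBLIGATORY and FIXED values can be illustrated with a small stand-in. This is a sketch under stated assumptions, not the Init component itself: the class InitSketch and the constant values are hypothetical, only setText and setInt are modelled, and exec receives the query values as an argument, whereas the component described above takes them from the JavaScript context.

```javascript
// Hypothetical stand-ins for the ITPilot constants (real values unknown).
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// Stand-in for Init: collects field definitions and builds the input record.
class InitSketch {
  constructor() {
    this.fields = [];
  }
  setText(field, obl = OPTIONAL, fixedValue) {
    this.fields.push({ field, type: "text", obl, fixedValue });
  }
  setInt(field, obl = OPTIONAL, fixedValue) {
    this.fields.push({ field, type: "int", obl, fixedValue });
  }
  // Assumption of this sketch: queryValues plays the role of the values
  // sent by the calling application.
  exec(queryValues = {}) {
    const record = {};
    for (const { field, obl, fixedValue } of this.fields) {
      if (obl === FIXED) {
        record[field] = fixedValue; // constant; the query cannot change it
      } else if (field in queryValues) {
        record[field] = queryValues[field];
      } else if (obl === OBLIGATORY) {
        throw new Error("Missing obligatory parameter: " + field);
      } else {
        record[field] = null; // optional and not supplied
      }
    }
    return record;
  }
}

const init = new InitSketch();
init.setText("TITLE", OBLIGATORY);
init.setInt("MAX_RESULTS");
init.setText("LANGUAGE", FIXED, "en");
const inputRecord = init.exec({ TITLE: "denodo" });
console.log(inputRecord);
```

In this sketch, a query without TITLE fails, MAX_RESULTS may be left out, and LANGUAGE always carries its fixed value.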


5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component


5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.


  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.


5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6 Using the Loop function


5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: allows iterating over a series of inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.
    • inputPage: page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): sets the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): sets the interval between two retries.
    • count: interval in milliseconds.


  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() method is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
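
The rule stated for the sequences parameter (a single sequence is reused in every iteration, several sequences are consumed one per iteration) can be sketched as a plain function. This is an illustration only: sequenceForIteration is a hypothetical helper, the sequence strings are placeholders rather than NSEQL, and returning null once the list is exhausted is an assumption of the sketch.

```javascript
// Returns the browsing sequence for the given 0-based iteration, or null
// when no sequence is left (assumption: the iterator would then stop).
function sequenceForIteration(sequences, iteration) {
  if (sequences.length === 1) {
    return sequences[0]; // single sequence: reused in every iteration
  }
  return iteration < sequences.length ? sequences[iteration] : null;
}

// Placeholder names; real entries would be NSEQL browsing sequences.
const single = ["SEQ_NEXT"];
const several = ["SEQ_TO_PAGE_2", "SEQ_TO_PAGE_3"];

const usedSingle = [0, 1, 2].map((i) => sequenceForIteration(single, i));
const usedSeveral = [0, 1, 2].map((i) => sequenceForIteration(several, i));
console.log(usedSingle, usedSeveral);
```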


5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as a wrapper result.
    • record: record to use.


5.3.22 Record Constructor

• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression. For example, “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.
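
How add() and the errorAction values interact can be sketched with a stand-in. The class RecordConstructorSketch and the constant values below are hypothetical; the sketch assumes the dotted expression form $<n>.<FIELD> (field FIELD of the n-th input record) and supports nothing else, and it assumes that ON_ERROR_IGNORE simply leaves the failing field out of the result.

```javascript
// Hypothetical stand-ins for the ITPilot constants (real values unknown).
const ON_ERROR_RAISE = "raise";
const ON_ERROR_IGNORE = "ignore";

// Stand-in for Record_Constructor. It understands only expressions of the
// form "$<n>.<FIELD>", i.e. field FIELD of the n-th input record.
class RecordConstructorSketch {
  constructor(recordsObj, name) {
    this.recordsObj = recordsObj;
    this.name = name;
    this.fieldDefs = [];
  }
  add(fieldName, expression, errorAction) {
    this.fieldDefs.push({ fieldName, expression, errorAction });
  }
  evaluate(expression) {
    const match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) throw new Error("Unsupported expression: " + expression);
    const record = this.recordsObj[Number(match[1])];
    if (record === undefined || !(match[2] in record)) {
      throw new Error("Cannot resolve " + expression);
    }
    return record[match[2]];
  }
  exec() {
    const output = {};
    for (const { fieldName, expression, errorAction } of this.fieldDefs) {
      try {
        output[fieldName] = this.evaluate(expression);
      } catch (error) {
        if (errorAction === ON_ERROR_RAISE) throw error; // stop the run
        // ON_ERROR_IGNORE: skip this field and carry on (assumption)
      }
    }
    return output;
  }
}

const builder = new RecordConstructorSketch(
  [{ PARAM1: "books" }, { PRICE: 12 }],
  "RESULT"
);
builder.add("CATEGORY", "$0.PARAM1", ON_ERROR_RAISE);
builder.add("PRICE", "$1.PRICE", ON_ERROR_RAISE);
builder.add("STOCK", "$1.STOCK", ON_ERROR_IGNORE); // field does not exist
const built = builder.exec();
console.log(built);
```

Changing the errorAction of the STOCK field to ON_ERROR_RAISE would make exec() throw instead of returning a partial record.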


5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information. In general, this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; indicates a starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.


5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.


5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
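
The practical difference between the while pattern of the Loop component and the do… while pattern of the Repeat component is the moment at which the condition is evaluated. The sketch below is stand-alone JavaScript, not ITPilot code: ConditionSketch is a hypothetical stand-in that wraps a plain predicate, whereas the Condition component evaluates an ITPilot condition expression. Both helper functions keep looping while exec() returns true, as the while clauses in the figures do.

```javascript
// Stand-in for Condition: wraps a plain JavaScript predicate (assumption;
// the real component evaluates an ITPilot condition expression).
class ConditionSketch {
  constructor(predicate) {
    this.predicate = predicate;
  }
  exec() {
    return this.predicate();
  }
}

// Loop pattern: the condition is checked before each pass.
function runLoop(condition, body) {
  let passes = 0;
  while (condition.exec()) {
    body();
    passes += 1;
  }
  return passes;
}

// Repeat pattern: the body runs once before the first check.
function runRepeat(condition, body) {
  let passes = 0;
  do {
    body();
    passes += 1;
  } while (condition.exec());
  return passes;
}

const neverTrue = new ConditionSketch(() => false);
const loopPasses = runLoop(neverTrue, () => {});
const repeatPasses = runRepeat(neverTrue, () => {});
console.log(loopPasses, repeatPasses);
```

With a condition that is false from the start, the while pattern never runs its body, while the do… while pattern runs it exactly once.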


5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it occupies within the process flow.


5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.


  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
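
The interplay of setRetries and setRetryDelay can be sketched with a stand-in. SequenceSketch is hypothetical and does not browse anything: the navigation is a function supplied by the caller, delays are recorded instead of waited for, and the sketch assumes that the retry count means additional attempts after the first failure, which this guide does not state explicitly.

```javascript
// Stand-in for the retry behaviour of Sequence. The navigation itself is a
// caller-supplied function, and delays are recorded rather than slept.
class SequenceSketch {
  constructor(navigate) {
    this.navigate = navigate; // hypothetical: runs the sequence, may throw
    this.retries = 0;
    this.retryDelay = 0;
    this.waits = []; // delays that a real run would wait for
  }
  setRetries(count) {
    this.retries = count;
  }
  setRetryDelay(mseconds) {
    this.retryDelay = mseconds;
  }
  exec(inputValues) {
    let lastError;
    // One initial attempt plus up to this.retries further attempts.
    for (let attempt = 0; attempt <= this.retries; attempt++) {
      if (attempt > 0) this.waits.push(this.retryDelay);
      try {
        return this.navigate(inputValues);
      } catch (error) {
        lastError = error;
      }
    }
    throw lastError;
  }
}

let calls = 0;
const flaky = new SequenceSketch(() => {
  calls += 1;
  if (calls < 3) throw new Error("timeout"); // fails twice, then succeeds
  return { url: "http://example.com/results" };
});
flaky.setRetries(3);
flaky.setRetryDelay(500);
const page = flaky.exec([]);
console.log(calls, flaky.waits, page.url);
```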


5.3.28 Store File

• Object: StoreFile
• Description: stores in a file the contents received as input parameter.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be automatically generated when the input file is null or is a directory.
    • generate: indicates whether the file name should be automatically generated.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread

• Description: this represents a thread in the ITPilot wrapper. It is typically used when the subsequent processing of each of the records obtained in an extraction operation is to be carried out concurrently.

• Functions:

o wait(): this causes the thread to enter standby until all executions invoked with the execute function have finished.

o execute(functionName, <list of arguments>): this launches a thread that runs the specified function.

• functionName: name of the JavaScript function to be run.

• <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that will be used in parallel. Later requests will be queued until the ongoing executions finish.

• int: maximum number of concurrent threads.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically by using the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

o This is the main function, where "mycustom" is the name of the custom component.

• mycustom_getInputStructure() { … }

o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

o This is the function that defines the component output type. The possible values are:

• LIST_TYPE = 1
• PAGE_TYPE = 2
• RECORD_TYPE = 3
• SIMPLE_TYPE = 4
• ARRAY_TYPE = 5
• BINARY_TYPE = 6
• BOOLEAN_TYPE = 7
• DATE_TYPE = 8
• DOUBLE_TYPE = 9
• FLOAT_TYPE = 10
• INT_TYPE = 11
• LONG_TYPE = 12
• STRING_TYPE = 13
• URL_TYPE = 14
• BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

o This function is responsible for defining the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it should be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file, the name of which matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the VQL statement simply has to be written as follows:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the recently generated JavaScript code.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the '\' character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES

The wrappers saved in the execution server can be invoked in two different ways. Firstly, the native ITPilot Java API can be used to access the wrappers, obtain their data structure and run queries on them from a Java application. Its description can be found in section 3. Another option is to expose these wrappers through Web Services. This latter option is described in this section.

A Web Service containing the following operations can be generated for a particular wrapper:

• An operation containing all searchable and compulsory parameters.

• Optionally, another operation with all searchable and compulsory parameters plus any searchable and optional parameters selected in the Web Service generation process (this process is defined in [USER]).

The ITPilot execution server generates a Web Service as a .war file that can be deployed in any J2EE application server.

2.1 WEB SERVICE TYPES

ITPilot allows a wrapper to be published as a Web Service to enable use by any external application. The ITPilot execution server generates a Web Service as a .war file that can be deployed in any J2EE application server. The types of Web services that ITPilot can publish are:

• SOAP [SOAP] Web Services.

• REST-style Web Services that use HTTP directly as the transport protocol and return data encoded in XML.

• HTML Web Services. Similar to the REST-style Web services, but the output consists of an HTML table containing the response data for the query executed. The table includes JavaScript code to sort the results by any field and/or paginate the returned results. It is also possible to adjust the size of the table and the cells and to modify its graphic appearance using a CSS file.

The following sections describe the querying process for these Web Services.

2.2 INVOKING SOAP WEB SERVICES

The SOAP version of the published Web Services can be accessed by using any Web Service client or client generator that meets the SOAP 1.2 [SOAP] and WSDL 1.1 [WSDL] standards, such as the Apache Axis wsdl2java [AXIS] or .NET Framework wsdl [DOTNET] tools. The WSDL from which the clients are generated can be obtained either from the local file created by ITPilot or through the access URL to the Web Service WSDL:

http://<domain>:<port>/<service_name>/services/<service_name>?wsdl

The ITPilot distribution contains, in the samples/itpilot/itp-clients directory, a sample client generated using Apache Axis. The README file residing in this path contains detailed information on how to generate, compile and run the files comprising the client application.

2.3 INVOKING THE EXPORTED REST AND HTML WEB SERVICES

This section describes how to invoke the REST and HTML versions of the Web Services that have been published by DataPort, once they have been deployed in the Web Service container. Once the .war file has been deployed in the J2EE application server, the relative paths /rest and /html of the webapp show an information screen for the respective Web service version, which lists the available operations.

Example: if the Web service container is running on port 9090 of the acme host and the name chosen for the exported web service was testWS, the access URLs for the information page in the REST (XML output) and HTML versions would be:

http://acme:9090/testWS/rest
http://acme:9090/testWS/html

For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the xsd file that describes the schema of the XML document returned by each operation. To access the XML Schema of the data returned by invoking an operation of the REST version of the Web service, the following URL format should be used:

http://host:port/serviceName/rest/opName.xsd

Example: again, if the Web service container runs on port 9090 of the acme host and the name chosen for the exported web service was testWS, the following URL will obtain the XML Schema of the data returned by the operation getPRODUCTDATA:

http://acme:9090/testWS/rest/getPRODUCTDATA.xsd

The format used to invoke a specific operation in the REST version is the following:

http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen

where n is the number of parameters of the operation. The format for the HTML version is the same, but replacing 'rest' with 'html'.

Example: the Web service container runs on port 9090 of the acme host and the name chosen for the exported web service was testWS. Let us also suppose that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML Web service versions, respectively:

http://acme:9090/testWS/rest/getPRODUCTDATA
http://acme:9090/testWS/html/getPRODUCTDATA

If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results when this parameter has a value equal to 1 would be obtained by writing:

http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
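
The invocation format described in this section can be assembled programmatically. The following Java sketch is illustrative only: the class and method names (RestUrlSketch, buildUrl) are hypothetical and are not part of ITPilot, and it assumes that the operation parameters travel as a standard query string (a '?' before the first parameter and '&' between parameters), with names and values URL-encoded.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the ITPilot distribution.
class RestUrlSketch {

    /**
     * Builds a URL of the form
     * http://host:port/serviceName/version/opName?name1=value1&...&nameN=valueN
     * where version is "rest" or "html".
     */
    static String buildUrl(String host, int port, String serviceName,
                           String version, String opName,
                           Map<String, String> params) {
        StringBuilder url = new StringBuilder("http://")
                .append(host).append(':').append(port)
                .append('/').append(serviceName)
                .append('/').append(version)
                .append('/').append(opName);
        char separator = '?';               // first parameter follows '?'
        for (Map.Entry<String, String> p : params.entrySet()) {
            url.append(separator)
               .append(encode(p.getKey()))
               .append('=')
               .append(encode(p.getValue()));
            separator = '&';                // later parameters follow '&'
        }
        return url.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("prod_id", "1");
        System.out.println(buildUrl("acme", 9090, "testWS", "rest",
                "getPRODUCTDATABYPRODID", params));
        System.out.println(buildUrl("acme", 9090, "testWS", "html",
                "getPRODUCTDATA", new LinkedHashMap<String, String>()));
    }
}
```

Encoding each name and value keeps the URL valid when a parameter contains spaces or reserved characters, which the examples in this section do not need but real search values often do.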

2.3.1 HTML Output Configuration

The HTML version of the published Web Services may be invoked with certain additional parameters to configure the HTML table used to display the results of the queries. The additional parameters are as follows:

• shownumresults: If this parameter is indicated with the true value, the table will display information on the number of results obtained by the wrapper.

• intervalsize: If this parameter is indicated, the results obtained by the wrapper will be displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.

• maxresults: This indicates a maximum number of results to be displayed. If the wrapper run returns more results than those indicated, all excess results will be rejected.

• cellwidth: Maximum cell width, expressed in number of characters. The width of each cell in the table will be adapted to the text, except where the size indicated in this parameter is exceeded. In this case, carriage returns will be added to divide the text into lines.

• cellheight: Maximum number of lines in a cell, after having divided the text according to the cellwidth parameter value. If this is exceeded, all the cells of this column are given a scroll bar.

• width: This specifies the maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.

• height: This specifies the maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.

These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:

httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen

For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a maximum pagination interval size equal to 10. Once again, it is assumed that the Web service container runs on port 9090 of the acme machine:

httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10

2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES

When the Web Service operations have been exported, there are some parameters that can be used to configure the connection pool used by the Web Services to connect to the ITPilot server. The web.xml file, which can be found in the WEB-INF path of the exported web service (either inside the .war file generated by ITPilot or in the directory where the Web Service has been deployed), has three parameters used to configure the connection pool:

1. poolEnabled: this parameter is used to enable or disable the connection pool. The possible values are "true" or "false".

    <env-entry>
        <env-entry-name>poolEnabled</env-entry-name>
        <env-entry-value>false</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

2. poolInitSize: defines the initial size of the connection pool.

    <env-entry>
        <env-entry-name>poolInitSize</env-entry-name>
        <env-entry-value>0</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this parameter value, new requests will be queued until a connection becomes free.

    <env-entry>
        <env-entry-name>poolMaxActive</env-entry-name>
        <env-entry-value>30</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
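
Before redeploying a Web Service, it can be useful to check which pool settings a given web.xml actually carries. The following Java sketch reads the env-entry name/value pairs with the standard JDK XML parser. It is only an illustration: the class name PoolSettingsSketch and its methods are hypothetical, the XML in main is a trimmed stand-in for a real web.xml, and a real file that declares a DOCTYPE or namespaces may need additional parser configuration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of the ITPilot distribution.
class PoolSettingsSketch {

    /** Returns the env-entry name/value pairs found in the given web.xml text. */
    static Map<String, String> readEnvEntries(String webXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            webXml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> entries = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("env-entry");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element entry = (Element) nodes.item(i);
                entries.put(childText(entry, "env-entry-name"),
                            childText(entry, "env-entry-value"));
            }
            return entries;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse web.xml content", e);
        }
    }

    // Text of the first child element with the given tag name.
    private static String childText(Element parent, String tag) {
        return parent.getElementsByTagName(tag).item(0).getTextContent().trim();
    }

    public static void main(String[] args) {
        // Trimmed stand-in for the web.xml of an exported Web Service.
        String webXml =
              "<web-app>"
            + "<env-entry><env-entry-name>poolEnabled</env-entry-name>"
            + "<env-entry-value>false</env-entry-value>"
            + "<env-entry-type>java.lang.String</env-entry-type></env-entry>"
            + "<env-entry><env-entry-name>poolInitSize</env-entry-name>"
            + "<env-entry-value>0</env-entry-value>"
            + "<env-entry-type>java.lang.String</env-entry-type></env-entry>"
            + "<env-entry><env-entry-name>poolMaxActive</env-entry-name>"
            + "<env-entry-value>30</env-entry-value>"
            + "<env-entry-type>java.lang.String</env-entry-type></env-entry>"
            + "</web-app>";
        System.out.println(readEnvEntries(webXml));
    }
}
```

All three values are declared as java.lang.String in web.xml, so the sketch returns them as strings and leaves any conversion to boolean or int to the caller.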

3 ITPILOT DEVELOPMENT API

Denodo ITPilot incorporates a Java API for developing applications using the wrappers created with it. Amongst other functions, this API facilitates connection to a Denodo ITPilot execution server, obtaining a reference to a wrapper installed in said server and querying it. It also allows a series of additional tasks, like obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.

The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Amongst other tasks, said instance makes it possible to obtain a list of the available wrappers in the server, as well as a reference to a specific wrapper, represented through an instance of the class HTMLWrapperProxy. Said instance may be used to carry out various tasks on the wrapper, the most important of which is query execution.

When a query is invoked on the wrapper, the results are returned to the application in an asynchronous manner (i.e. the first results of the query will be accessible to the application as they are obtained from the source, without having to wait for all the results to be received).

The following subsections deal in more detail with each of the stages mentioned: connection to the server, obtaining references to wrappers, executing actions on them and query processing. An exhaustive description of the API on a programming level can be found in the Javadoc documentation [JDOC].

3.1 CONNECTING TO THE SERVER

There are two ways in which a connection to the ITPilot execution server can be made, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.

If Denodo ITPilot has been installed separately, then the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port in which the server is executed.

If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, then DataPort will be used as an execution server for ITPilot. In this case, it is possible to specify any database created in the Virtual DataPort server in the connection to the server and use any user defined in it. The actions allowed for the user will be consistent with the permissions assigned to said user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).

In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In this constructor, in addition to the machine and port in which the server is executed, the name of the database of the Virtual DataPort server to which the connection is to be made should be specified, as well as the user ID with which access is to be made and the associated password.

It is important to highlight that, even if Virtual DataPort is installed, it is equally possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called 'itpilot' will be accessed. The predefined user 'admin' (with the initial password 'admin') will be used to gain access.

3.2 OBTAINING WRAPPERS

As mentioned in the preceding section, connection to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class incorporates methods for obtaining data on the execution server and accessing the wrappers present in it:

• Collection getHTMLWrapperNames(): Obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as execution server, the connection will have been made to a Virtual DataPort database and only those wrappers associated with said database will be obtained.

• HTMLWrapperProxy getHTMLWrapper(String wpName): Obtains a reference to the wrapper with the name specified as parameter.

• Collection getDatabaseNames(): This method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.

• void deleteWrapper(String wpName): Deletes from the server the wrapper whose name is specified as parameter.

• void loadWrapper(String vql): Takes as input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.

• String getVQL(): Returns the VQL description of all wrappers in the ITPilot execution server.

3.3 USING WRAPPERS

Once a reference to a wrapper has been obtained (instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of said class. To execute a query on a wrapper, we will use the method:

HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified during the creation of the wrapper. The values must be specified as character strings, even when the input parameters expected by the wrapper belong to another type. For example, if a wrapper is expecting a float-type parameter and we want to assign the value 325 when invoking it, we must pass the "325" string. In the case of float, double and date data types, it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or, in the case of date data types, the date pattern, if it was set.

It is important to take into account that, for the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.

Although most applications will not require this, a wrapper schema can be obtained using the method:

HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of said schema. The schema was defined during the generation of the wrapper (see [GENER]).

The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of atomic attributes can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and again these fields may be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result is comprised of the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation [JDOC] for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the various atomic fields that comprise the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained from an atomic field:

• its type, by using the method java.lang.Class getType();

• whether it is a searchable field, that is, a field whose value is not obtained from the source and which may not appear in the output schema (method boolean isSearchStatus());

• if it is searchable, whether it is mandatory or not (method boolean isMandatoryStatus()).

Furthermore, if they have been defined during the generation process, it is also possible to obtain the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).

Finally, the methods

void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting via the API whether a wrapper should be automatically maintained or not by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper will be stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the DENODO_HOME/metadata/maintenance-backup path (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.

3.4 PROCESSING QUERY RESULTS

The query method for executing queries on a wrapper returns as a result an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query made. Accessing results in an asynchronous manner means that the server will return results of the query as they are obtained from the source (it is important to remember that the wrapper obtains the data from the source in real time through the network).

The method hasNext() checks whether there are still elements to return. Because of the asynchronous behavior, this method must be used before accessing each element, to make sure that data elements are available.

The method next() of HTMLWrapperResultIterator obtains the next result. In this case, each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field. The method next() will throw an exception of type NoSuchElementException if there are no data available at that moment, even if the wrapper still has results to return. Hence the need to use the method hasNext().

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO, BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX represents the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). In the case of BlobVO, the following method is provided: java.lang.Byte[] getBytes(). In the case of DateVO, the method is long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Using its method getValues(), a list of the registers it contains can be obtained (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:

• Boolean checkErrors(): Allows you to check whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.

• String getErrorDescription(): Where an error has occurred, this allows you to obtain a textual description of it. Otherwise, it returns null. The custom error messages specified by the wrapper creator for the 'raise error' handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
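
The hasNext()/next() contract described in this section can be illustrated with plain JDK classes. The following Java sketch is not the ITPilot implementation: QueueBackedIterator is a hypothetical stand-in that assumes hasNext() waits until either a result or an end-of-results marker arrives, and that next() throws NoSuchElementException when no result has been made available by a preceding hasNext() call.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class AsyncResultsSketch {

    /** Hypothetical stand-in for an asynchronous result iterator; NOT the ITPilot class. */
    static class QueueBackedIterator implements Iterator<String> {
        private static final Object END = new Object();   // end-of-results marker
        private final BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
        private String buffered;                           // result fetched by hasNext()
        private boolean finished;

        void publish(String row) { queue.add(row); }       // called by the producer
        void finish() { queue.add(END); }                  // producer has no more rows

        @Override
        public boolean hasNext() {
            if (buffered != null) return true;
            if (finished) return false;
            try {
                Object taken = queue.take();               // waits for a row or END
                if (taken == END) {
                    finished = true;
                    return false;
                }
                buffered = (String) taken;
                return true;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                finished = true;
                return false;
            }
        }

        @Override
        public String next() {
            if (buffered == null) {
                throw new NoSuchElementException("no result available; call hasNext() first");
            }
            String row = buffered;
            buffered = null;
            return row;
        }
    }

    /** Consumes rows while a background thread is still producing them. */
    static List<String> demo() {
        QueueBackedIterator results = new QueueBackedIterator();
        Thread producer = new Thread(() -> {
            results.publish("row-1");
            results.publish("row-2");
            results.finish();
        });
        producer.start();
        List<String> seen = new ArrayList<>();
        while (results.hasNext()) {        // always check before each next()
            seen.add(results.next());
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With the real API, the same loop shape applies to HTMLWrapperResultIterator, followed by a call to checkErrors() once the loop ends to find out whether the results stopped because of a failure.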

3.4.1 Canceling Queries

The following method from the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

void cancel()

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed in the 'acme' machine on port 9999. Next, a reference is obtained to the wrapper called "Movies", whose schema is the same one used as an example in the preceding section:

TITLE, DIRECTOR, EDITIONS (FORMAT, PRICE, DESCRIPTION)

where TITLE and DIRECTOR are optional search fields.

Then, a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and shown in the standard output. To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of the type text, except the field PRICE, which is a double. Finally, any possible errors produced during execution are checked.

package com.denodo.itpilot.client;

import java.util.List;
import java.util.HashMap;
import java.util.Map;
import java.util.Iterator;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

public class ITPilotExample {

    public static void main(String args[]) {
        try {
            // Connect to server
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

            // Get Wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare query params
            Map queryParams = new HashMap();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate results
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + ". ");

                // Get and print atomic fields TITLE, DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);

                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();
                    Map edition = editionVO.getValues();

                    SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);

                    // PRICE is the only non-text field: it holds a double
                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);

                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\tDESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server");
        } finally {
        }
    }
}

Figure 1. Example of query execution on a wrapper

4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes packaged in a Jar file and added to ITPilot, so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created without dependencies on Denodo libraries by following naming conventions, but we recommend developing them with Java annotations. The annotations, and the compound types and values required to create custom functions, are located in:

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a Jar contains one or more functions with name conflicts, nothing in that Jar is loaded in the server.

• All custom functions stored in the same Jar are added or removed together, by uploading/removing the Jar in the server.

• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.

• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, this class must define a Java method implementing the functionality of the function with those arguments, and one additional method in case the signature returns a different type depending on the parameters or the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be basic types (int, long, double, etc.).

4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction is interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String generates the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or that return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. The definition of this method is optional. If it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid the limitations of naming conventions, it is recommended to use the Java annotations provided by Denodo in the Jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: class annotation used to define the class as a custom function. The annotation requires the parameters:

  o name: name of the custom function.

  o type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor: method annotation used to specify the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional variable, syntax, that specifies the syntax of the function signature when presenting it to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType: method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.

• com.denodo.common.custom.annotations.CustomParam: parameter annotation with the parameter name, used to make the automatically generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.

4.2 COMPOUND TYPES

Compound types and values in custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on input values, and functions returning compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).

2. The execution method must have the same number of parameters as the additional method.

3. Each parameter of the additional method must have the same or an equivalent type as its respective parameter in the execute method.

If the execute method returns a basic Java type, the additional method has to return the same basic Java class, i.e. if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 to know the type that these return values will have in ITPilot.

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    // Function signature: splits 'value' around matches of 'regex'
    @CustomExecutor()
    public CustomArrayValue split_sample(
            @CustomParam(name="regexp") String regex,
            @CustomParam(name="value") String value) {
        if (value == null || regex == null) {
            return null;
        }
        String[] result = value.split(regex);
        LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
        List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
        // Each substring becomes a one-field record of the returned array
        for (String string : result) {
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    // Computes the return type (array of records with one text field) without executing the function
    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}

Figure 2. ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA-262 3rd edition standard [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.

Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

function getInit() {
    var start = new Init();
    start.setText("INITPARAM", "", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}

Figure 3. ITPilot Wrapper Skeleton in JavaScript

Each script can contain three functions, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.

2. getInit() function: returns the set of searchable parameters.

3. getOutputSchema() function: returns the structure of the output objects, if they exist (1).

These functions correspond to the components of the process definition: the input parameters are those defined in the Initialization component, and the output record is defined just as the Output component receives it.

(1) Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. The old name is still supported for backwards compatibility, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

This function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on in section 5.3 (the Component Catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure

This function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter lists the predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters of the functions described can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL)

rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY)
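The positional rule in the note can be sketched with a small stand-in. The Record_Structure stub below is hypothetical (in a real wrapper ITPilot supplies the object and the OPTIONAL and OBLIGATORY constants); it only mimics how trailing arguments fall back to defaults, assuming the defaults given for setText in the Record Structure section. How the real component reacts to the invalid call is not specified here.

```javascript
// Hypothetical stand-ins: ITPilot provides the real constants and object.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

function Record_Structure(name) {
    this.name = name;
    this.fields = [];
}

// setText(field, regexp, type): only trailing arguments may be omitted.
Record_Structure.prototype.setText = function (field, regexp, type) {
    this.fields.push({
        field: field,
        regexp: regexp === undefined ? "" : regexp,
        type: type === undefined ? OPTIONAL : type
    });
};

var rs = new Record_Structure("OUT_REC");
rs.setText("A");                  // regexp defaults to "", type to OPTIONAL
rs.setText("B", "", OBLIGATORY);  // regexp passed explicitly so type can be set
rs.setText("C", OBLIGATORY);      // mistake: OBLIGATORY lands in the regexp slot
```

In the stub, the third call silently stores OBLIGATORY as the regular expression and leaves the field optional, which is why the middle argument has to be written out whenever a later one is needed.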

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.

5.3.2.1 Record Structure

• Object: Record_Structure

• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).

• Functions:

  o Constructor(name)

    • name: name of the structure.

  o setText(field, regexp, type): creates a new character string field in the record.

    • field: name of the new field.

    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setLink(field, type): creates a new Link-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setInt(field, type): creates a new Integer-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setBoolean(field, type): creates a new boolean-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setLong(field, type): creates a new Long-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setFloat(field, type): creates a new Float-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setDouble(field, type): creates a new Double-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.

    • field: name of the new field.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setDate(field, regexp, format, type): creates a new Date-type field in the record.

    • field: name of the new field.

    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".

    • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setRegister(record, type): creates a new Record-type field in the record.

    • record: record name.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setArray(name, structure, type): creates a new Array-type field in the record.

    • name: name of the array.

    • structure: data structure that represents the record structure contained in the array.

    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List

• Functions:

  o setListName(listName): sets the name of the list.

    • listName: name of the list.

  o add(obj): adds an element to the list.

    • obj: element to add.

  o toArray(): transforms the list into a JavaScript object array.
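As a rough illustration of how these three functions fit together, the sketch below uses a hypothetical List stub written in plain JavaScript. The stub, the no-argument constructor and the sample records are assumptions for illustration only; inside a wrapper the List object is supplied by ITPilot.

```javascript
// Hypothetical stand-in for the ITPilot List object.
function List() {
    this.listName = null;
    this.items = [];
}
List.prototype.setListName = function (listName) { this.listName = listName; };
List.prototype.add = function (obj) { this.items.push(obj); };
// Returns a copy, so callers can iterate without touching the list itself.
List.prototype.toArray = function () { return this.items.slice(0); };

var movies = new List();
movies.setListName("MOVIES");
movies.add({ TITLE: "Manhattan" });   // sample records, plain objects here
movies.add({ TITLE: "Annie Hall" });

// toArray() gives an ordinary JavaScript array that can be looped over.
var titles = [];
var records = movies.toArray();
for (var i = 0; i < records.length; i++) {
    titles.push(records[i].TITLE);
}
```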

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these "common" functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave when a given type of error occurs. The onError function can be invoked several times with different errorId parameter values.

  o errorId: indicates the type of error whose behavior is being configured. The possible values are:

    • RUNTIME_ERROR: error while the component is being run.

    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.

    • HTTP_ERROR: error produced by an HTTP error.

    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the http browser.

    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).

  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:

    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.

    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have any kind of return value will return "null" if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it is evaluated as "false" if they are used in a condition expression.

    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each parameter.

    • ON_ERROR_RETRY_IGNORE: reruns the wrapper, as with the ON_ERROR_RETRY error type, but continues with the wrapper execution if the error is still happening after the retries.
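The four error actions can be compared side by side with a small simulation. This is a sketch under stated assumptions, not ITPilot's implementation: runWithPolicy, flakyStep and brokenStep are hypothetical names, the constants are redefined locally as plain strings, the delay between retries is left out, and ON_ERROR_RETRY is assumed to raise once its retries are used up (which the description of ON_ERROR_RETRY_IGNORE suggests but does not state).

```javascript
// Local stand-ins for the ITPilot constants.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Runs 'step' and applies the chosen action if it throws (hypothetical helper).
function runWithPolicy(step, action, retries) {
    var retrying = action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE;
    var attempts = retrying ? retries + 1 : 1;   // first run plus the retries
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return step();
        } catch (e) {
            lastError = e;
        }
    }
    if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
        return null;      // carry on; the failed component yields null
    }
    throw lastError;      // RAISE, or RETRY with all attempts exhausted
}

var calls = 0;
function flakyStep() {    // fails twice, then succeeds
    calls++;
    if (calls < 3) { throw new Error("TIMEOUT_ERROR"); }
    return "page";
}
function brokenStep() {   // always fails
    throw new Error("CONNECTION_ERROR");
}

var retried = runWithPolicy(flakyStep, ON_ERROR_RETRY, 3);
var ignored = runWithPolicy(brokenStep, ON_ERROR_IGNORE, 0);
var retryIgnored = runWithPolicy(brokenStep, ON_ERROR_RETRY_IGNORE, 2);
```

With ON_ERROR_RAISE the same brokenStep call would throw instead of returning null, which is the difference a wrapper author chooses between when calling onError.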

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:

o TRACE

o DEBUG

o INFO

o WARN

o ERROR

o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List

• Description: adds a record to a list.

• Functions:

  o Constructor()

  o exec(record, list): executes the function.

    • record: record to be added to the list.

    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition

• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.

• Functions:

  o Constructor(expr)

    • expr: this parameter defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.

    • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements on which the condition is evaluated.
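The way $0 and $1 refer to positions in the elements array can be sketched as follows. ConditionSketch is a hypothetical stand-in that understands only a single comparison between two positional references; it does not cover the functions of Appendix A of [GENER] and is not how ITPilot evaluates expressions.

```javascript
// Hypothetical stand-in: parses expressions such as "($0 <= $1)".
function ConditionSketch(expr) {
    var m = /^\(?\s*\$(\d+)\s*(<=|>=|==|!=|<|>)\s*\$(\d+)\s*\)?$/.exec(expr);
    if (!m) {
        throw new Error("unsupported expression: " + expr);
    }
    this.left = parseInt(m[1], 10);    // index of the first operand in elements
    this.op = m[2];
    this.right = parseInt(m[3], 10);   // index of the second operand
}

// exec(elements): $n is replaced by elements[n] before comparing.
ConditionSketch.prototype.exec = function (elements) {
    var a = elements[this.left];
    var b = elements[this.right];
    switch (this.op) {
        case "<=": return a <= b;
        case ">=": return a >= b;
        case "==": return a == b;
        case "!=": return a != b;
        case "<":  return a < b;
        default:   return a > b;
    }
};

var myCondition = new ConditionSketch("($0 <= $1)");
var met = myCondition.exec([3, 5]);      // 3 <= 5
var notMet = myCondition.exec([7, 5]);   // 7 <= 5
```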

5.3.6 Create List

• Object: Create_List

• Description: creates an empty list.

• Functions:

  o Constructor(listname): creates an empty list.

    • listname: name of the list of records to be created.

  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser

• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.

• Functions:

  o Constructor(): creates a persistent browser and returns its handler.

  o exec(): executes the component.

5.3.8 Diff

• Object: Diff

• Description: the Diff component compares two pages and returns the differences between them in the retrieved HTML code.

• Functions:

  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)

    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).

    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).

    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).

    • tokenSeparator: character string used as the HTML page element separator when the result page is generated, so that each element can be adequately identified.

  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.

    • baseCode: character string with the source page content.

    • finalCode: character string or page object with the target page content.

  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.

    • baseCode: character string with the source page content.

    • finalCode: character string or page object with the target page content.

  o setAdditionPrefixLabel(additionPrefixLabel): modifies the start tag for added data.

    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).

  o setAdditionSuffixLabel(additionSuffixLabel): modifies the end tag for added data.

    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).

  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the start tag for deleted data.

    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the end tag for deleted data.

    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).

  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.

    • nullWhenEquals: "true" implies that "null" is returned when both pages are equal; "false" means that the result page is returned.

  o setIgnoreTagAttributes(simplifyTags): the component does not take HTML tag attributes into account when comparing both pages.

    • simplifyTags: "true" means that the HTML tag attributes are ignored. With "false" they are not ignored.

  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.

    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.

  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.

    • mergedDeletions: with "true", the deleted content is shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.

  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to HTML tokens of the source pages before comparing them.

    • replacement: Perl [PERL] regular expression.

  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to HTML tokens of the page. Tokens that match the regular expression are discarded before starting the comparison.

    • regexp: Perl [PERL] regular expression.
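To make the role of the prefix, suffix and separator labels more concrete, here is a minimal token diff based on a longest common subsequence. It is only a sketch: diffTokens and its labels object are hypothetical, the pages are assumed to be already split into tokens, and nothing here is claimed to match the algorithm the Diff component actually uses.

```javascript
// Marks added and deleted tokens between two token arrays (hypothetical helper).
function diffTokens(baseTokens, finalTokens, labels) {
    var n = baseTokens.length, m = finalTokens.length, i, j;
    // lcs[i][j] = length of the longest common subsequence of
    // baseTokens[i..] and finalTokens[j..]
    var lcs = [];
    for (i = 0; i <= n; i++) {
        lcs[i] = [];
        for (j = 0; j <= m; j++) { lcs[i][j] = 0; }
    }
    for (i = n - 1; i >= 0; i--) {
        for (j = m - 1; j >= 0; j--) {
            lcs[i][j] = baseTokens[i] === finalTokens[j]
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }
    var out = [];
    i = 0; j = 0;
    while (i < n && j < m) {
        if (baseTokens[i] === finalTokens[j]) {
            out.push(baseTokens[i]); i++; j++;            // unchanged token
        } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
            out.push(labels.delPrefix + baseTokens[i] + labels.delSuffix); i++;
        } else {
            out.push(labels.addPrefix + finalTokens[j] + labels.addSuffix); j++;
        }
    }
    while (i < n) { out.push(labels.delPrefix + baseTokens[i] + labels.delSuffix); i++; }
    while (j < m) { out.push(labels.addPrefix + finalTokens[j] + labels.addSuffix); j++; }
    return out.join(labels.separator);
}

var labels = { addPrefix: "[+", addSuffix: "+]", delPrefix: "[-", delSuffix: "-]", separator: " " };
var merged = diffTokens(["<p>", "old", "</p>"], ["<p>", "new", "</p>"], labels);
// merged is "<p> [-old-] [+new+] </p>"
```

Swapping the bracket labels for HTML tags with a green or red background would give a result page in the spirit of the defaults described above.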

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>) SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4. Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.

• Functions:

  o Constructor(expression)

    • expression: object that defines the condition expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.

    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
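
The positional notation used in the constructor example can be illustrated outside ITPilot. The following is a minimal sketch in plain JavaScript, not the ITPilot implementation: resolvePlaceholders is a hypothetical helper that only shows how $0, $1 and so on map onto the list passed to exec, assuming 0-based positions as described above.

```javascript
// Hypothetical stand-in, not part of the ITPilot API: it only substitutes
// "$n" placeholders with the n-th element of the input list (0-based).
function resolvePlaceholders(expression, exprInput) {
  return expression.replace(/\$(\d+)/g, (match, index) => {
    const position = Number(index);
    if (position >= exprInput.length) {
      throw new Error("No input value at position " + position);
    }
    return JSON.stringify(exprInput[position]);
  });
}

// "$0" is bound to the first element and "$1" to the second.
console.log(resolvePlaceholders("($0 <= $1)", [3, 5]));
```

The sketch performs the substitution only; evaluating the resulting expression is left to the component.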

5.3.11 Extractor

• Object: Extractor

• Description: this component is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).

• Functions:

  o Constructor(name, page, specification, structure)

    • name: name of the Extractor component instance.

    • page: page-type ITPilot structure from which data is to be extracted.

    • specification: DEXTL data extraction specification (see [DEXTL]).

    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.

  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the constructor in the structure parameter.

  o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).

    • merge: Boolean parameter; “true” if the pattern merge technique is to be applied, “false” if not. This is “true” by default.

  o setI18n(i18n): updates the internationalization used by the process.

    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

  o Constructor(url, sequenceType, reusableConnection, binary, page)

    • url: URL where the resource to be downloaded can be found (OPTIONAL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.

    • page: optionally, the page from which the HTTP request is launched.

  o exec(page): runs the component and returns the string- or binary-type value obtained.

    • page: optionally, the page from which the HTTP request is launched.

  o setEncoding(encoding): lets the user determine the MIME type [MIME] of the information to send.

    • encoding: MIME type of the information to send.

  o syncWithPost(flag): lets the user set the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.

    • flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, ITPilot executes a Back() NSEQL command.

  o setBackSequence(back): lets the user optionally set an explicit browsing sequence back to the source page, against which further data extraction operations will be executed.

    • back: back sequence (NSEQL program).

  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

    • reusingConnection: if set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing information from the previous session.

  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor a POST navigation has been defined.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
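
The order in which syncWithPost, setBackSequence and setBackPages are consulted can be summarized as a small decision function. This is only a sketch in plain JavaScript, not ITPilot code: chooseSyncStrategy, its option names and the default of one back page are assumptions made for illustration.

```javascript
// Hypothetical helper: reports which page-state recovery mechanism would apply,
// following the order described for syncWithPost, setBackSequence and setBackPages.
function chooseSyncStrategy(config) {
  // Assumptions: syncWithPost defaults to true (documented as the default method);
  // backPages defaults to 1 (not stated in this guide).
  const { syncWithPost = true, backSequence = null, backPages = 1 } = config;
  if (syncWithPost) {
    return { method: "POST" };
  }
  if (backSequence) {
    return { method: "BACK_SEQUENCE", sequence: backSequence };
  }
  return { method: "BACK_COMMAND", pages: backPages };
}

console.log(chooseSyncStrategy({}).method);
console.log(chooseSyncStrategy({ syncWithPost: false, backPages: 2 }).method);
```

Keeping the decision in one place makes it easier to see that setBackPages only matters once the other two mechanisms have been ruled out.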

5.3.13 Filter

• Object: Filter

• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

  o Constructor(expr, auxiliaryRecords)

    • expr: selection expression of the filtering operation, applied to the list of records described in the exec function.

    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.

    • inputRecords: list of input records.

    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
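
The behaviour described in the NOTE can be sketched in plain JavaScript. This is an illustration under stated assumptions, not the ITPilot implementation: filterRecords is a hypothetical function, the selection expression is modelled as an ordinary predicate, and the two constants are local stand-ins for the ITPilot ones.

```javascript
// Local stand-ins for the ITPilot error-handling constants.
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical filter: keeps the records accepted by the predicate.
function filterRecords(inputRecords, predicate, errorAction) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record)) {
        kept.push(record);
      }
    } catch (err) {
      if (errorAction !== ON_ERROR_IGNORE) {
        throw err; // ON_ERROR_RAISE: stop at the first failing record
      }
      // ON_ERROR_IGNORE: drop only the record that caused the error
    }
  }
  return kept;
}

const records = [{ PRICE: 5 }, { PRICE: null }, { PRICE: 20 }];
function priceAboveTen(record) {
  if (record.PRICE === null) {
    throw new Error("PRICE has no value");
  }
  return record.PRICE > 10;
}
console.log(filterRecords(records, priceAboveTen, ON_ERROR_IGNORE).length);
```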

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: generates a run loop over a specific form, using predetermined values for each of its fields in each run.

• Functions:

  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

    • findForm: NSEQL program that locates the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences used in this component.

    • inputPage: input page from which the selected form is iteratively invoked.

    • parallelIterator: if “true”, the component executes its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • valuesArray: list of values that must be selected in the field.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained therein (equals = false).

    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.

    • field: name of the HTML selection field.

    • position: position of the field in the event of more than one field with the same name.

    • positions: values of the elements on which the component must iterate.

  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.

    • field: name of the HTML text field.

    • position: position of the field in the event of several fields on the form with the same name.

    • values: list of values that must be selected in the field.

    • positions: list that indicates the position held by each element of values in the event of replicated values.

    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may only be contained in them.

  o click(field, value, state): selects an element and runs a “click” event on it.

    • field: name of the HTML field on which the click is to be made.

    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.

    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o input(field, position, values): indicates the values added to an input field.

    • field: name of the HTML input field.

    • position: position of the field in the event of several fields on the form with the same name.

    • values: list of values to be used in the field.

  o textarea(field, position, values): indicates the values added to a text area.

    • field: name of the HTML text area field.

    • position: position of the field in the event of several fields on the form with the same name.

    • values: list of values to be used in the field.

  o toList(): returns the list of NSEQL sequences used in each iteration.

  o setMaxIterations(count): sets the maximum number of iterations that can be executed.

    • count: maximum number of iterations.

  o setRetries(count): sets the number of retries in the event of failure.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o setParallelIterator(flag): makes the component launch its iterations in parallel.

    • flag: if “true”, the iterations are executed in parallel.

  o next(inputPage): returns the page resulting from running a component iteration.

    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.

  o hasNext(): determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.

  o close(): closes the iterator.

  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.

    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
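
The effect of CLICKED_AND_NON_CLICKED_ELEMENT on the number of iterations, and the hasNext/next/close loop, can be sketched in plain JavaScript. This is not the Form_Iterator implementation: CombinationIterator and its helpers are hypothetical, the constants are local stand-ins, and combining the entries as a cartesian product is an assumption.

```javascript
// Local stand-ins for the constants named above; the real ones are defined by ITPilot.
const CLICKED_ELEMENT = "CLICKED_ELEMENT";
const NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
const CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Expands one clickedArray entry into the concrete states it stands for.
function expandState(state) {
  return state === CLICKED_AND_NON_CLICKED_ELEMENT
    ? [CLICKED_ELEMENT, NON_CLICKED_ELEMENT]
    : [state];
}

// Assumption: combinations are the cartesian product of the expanded states.
function buildCombinations(clickedArray) {
  return clickedArray.reduce(
    (combos, state) =>
      combos.flatMap((combo) => expandState(state).map((s) => combo.concat(s))),
    [[]]
  );
}

// Hypothetical iterator exposing the hasNext()/next()/close() shape.
class CombinationIterator {
  constructor(clickedArray, maxIterations = Infinity) {
    this.combinations = buildCombinations(clickedArray);
    this.limit = Math.min(this.combinations.length, maxIterations);
    this.index = 0;
  }
  hasNext() {
    return this.index < this.limit;
  }
  next() {
    if (!this.hasNext()) {
      throw new Error("No more iterations");
    }
    return this.combinations[this.index++];
  }
  close() {
    this.index = this.limit;
  }
}

const formStates = new CombinationIterator([CLICKED_ELEMENT, CLICKED_AND_NON_CLICKED_ELEMENT]);
while (formStates.hasNext()) {
  console.log(formStates.next().join(", "));
}
formStates.close();
```

Each entry marked as both doubles the number of runs, which is why a cap such as the one set through setMaxIterations is useful on forms with many checkboxes.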

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool using a previously retrieved identification code.

• Functions:

  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.

    • browserUuid: browser id.

  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing by using a Sequence object (see section 5.3.27).

    • pageType: type of browser used to access the page.

      • SEQUENCE_IEBROWSER = 1

      • SEQUENCE_HTTP_BROWSER = 2

    • lastURL: last URL the page is coming from.

    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.

    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.

    • cookie: stored “cookies” information.

    • proxyUser: user name to access the proxy, if required.

    • proxyPassword: user password to access the proxy, if required.

    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: responsible for storing the structure of the input data, that is, the data that the wrapper will receive from the calling application.

• Functions:

  o Constructor(input, output)

    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context.

    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).

  o get(name): returns the value of a field of the record created from the initialization parameters.

    • name: name of the record field.

  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

    • field: name of the field to create.

    • format: representation format of the date field. This format is optional, but once specified it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].

    • obl: indicates whether a value for the field must be supplied in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): updates the component name.

    • name: new component name.

  o setI18n(i18n): updates the internationalization (i18n) used by the process.

    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

  o exec(): main function that runs the component, returning a record that represents the wrapper initialization parameters.
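
The three values of the obl parameter can be pictured as a small validation step applied to the parameters of each query. The sketch below is plain JavaScript written for illustration, not ITPilot code: resolveParameters, the field names and the rule that a FIXED field ignores any value supplied by the query are all assumptions.

```javascript
// Local stand-ins for the obl constants described above.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// Hypothetical helper: builds the initialization record for one query.
function resolveParameters(fields, query) {
  const record = {};
  for (const { field, obl = OPTIONAL, fixedValue } of fields) {
    if (obl === FIXED) {
      record[field] = fixedValue; // assumption: constant value wins over the query
    } else if (field in query) {
      record[field] = query[field];
    } else if (obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + field);
    }
    // OPTIONAL fields without a value are simply left out
  }
  return record;
}

const fields = [
  { field: "KEYWORD", obl: OBLIGATORY },
  { field: "MAX_PRICE" },
  { field: "LANGUAGE", obl: FIXED, fixedValue: "en" }
];
console.log(JSON.stringify(resolveParameters(fields, { KEYWORD: "laptop" })));
```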

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records one by one.

• Functions:

  o Constructor(list)

    • list: list of records on which to iterate.

  o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.

  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option available in the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29.

var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}

Figure 5. Using threads in the Iterator component
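
For readers who want to try the hasNext/next loop outside ITPilot, the following is a minimal sequential sketch. RecordListIterator is a hypothetical stand-in written in plain JavaScript; it is not the ITPilot Iterator object, and the TITLE field is invented for the example.

```javascript
// Hypothetical stand-in for the Iterator component: walks a list of records in order.
class RecordListIterator {
  constructor(list) {
    this.list = list;
    this.position = 0;
  }
  hasNext() {
    return this.position < this.list.length;
  }
  next() {
    if (!this.hasNext()) {
      throw new Error("No more records");
    }
    return this.list[this.position++];
  }
}

const iterator = new RecordListIterator([{ TITLE: "a" }, { TITLE: "b" }]);
const titles = [];
while (iterator.hasNext()) {
  const recordInstance = iterator.next();
  titles.push(recordInstance.TITLE);
}
console.log(titles.join(","));
```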

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.

• Functions:

  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

    • uuid: unique identifier of the component.

    • uri: connection URL to the database.

    • driver: driver class to use to connect to the data source.

    • userName: user name.

    • password: user password.

    • structure: structure of the component’s output record list. It is defined as a record of values.

    • baseRecords: record list to be used.

    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.

    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.

    • query: SQL query that returns the results required by the component.

  o exec(query, baseRecords): executes the JDBCExtractor component.

    • query: SQL query that returns the results required by the component.

    • baseRecords: record list to be used.

  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.

    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.

  o disablePool(): disables the connection pool.

  o addDriverProperty(propname, propvalue): adds a JDBC driver property.

    • propname: property name.

    • propvalue: property value.
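
Because the Constructor takes eleven positional parameters, it is easy to pass them in the wrong order. The sketch below shows one way to guard against that in plain JavaScript. It is not part of ITPilot: toConstructorArguments is a hypothetical helper, every value in the example is an illustrative placeholder, and requiring all eleven names to be present is a choice made for this sketch only.

```javascript
// Parameter order copied from the Constructor signature documented above.
const JDBC_EXTRACTOR_PARAMETERS = [
  "uuid", "uri", "driver", "userName", "password", "structure",
  "baseRecords", "maxPoolSize", "initialPoolSize", "checkQuery", "query"
];

// Hypothetical helper: turns a named-options object into the positional
// argument list, failing early if a name is missing.
function toConstructorArguments(options) {
  const missing = JDBC_EXTRACTOR_PARAMETERS.filter((name) => !(name in options));
  if (missing.length > 0) {
    throw new Error("Missing JDBCExtractor parameters: " + missing.join(", "));
  }
  return JDBC_EXTRACTOR_PARAMETERS.map((name) => options[name]);
}

// All values below are illustrative placeholders.
const jdbcArguments = toConstructorArguments({
  uuid: "jdbc-extractor-1",
  uri: "jdbc:postgresql://localhost:5432/example",
  driver: "org.postgresql.Driver",
  userName: "example_user",
  password: "example_password",
  structure: "RESULT_RECORD",
  baseRecords: [],
  maxPoolSize: 10,
  initialPoolSize: 2,
  checkQuery: "SELECT 1",
  query: "SELECT id, name FROM customers"
});
console.log(jdbcArguments.length);
```

The resulting array lists the values in the order the Constructor expects them.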

5.3.19 Loop

• Description: allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}

Figure 6. Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: iterates through a series of inter-related pages using one or several browsing sequences.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.

    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.

    • inputPage: indicates the page from which the next browsing sequence is to be run.

  o next(inputRecords, inputPage): returns the next iteration element.

    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.

    • inputPage: indicates the page from which the next pages are to be accessed.

  o close(): closes the iterator.

  o setRetries(count): sets the number of retries in the event of an error when accessing the next page.

    • count: number of retries.

  o setRetryDelay(count): sets the interval between two retries.

    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.

    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: HTTP browser implementation
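
The rule stated for the sequences parameter (a single sequence is reused in every iteration, several sequences are consumed one per iteration) can be written down as a short function. This is a plain JavaScript sketch, not the Next_Interval_Iterator implementation: sequenceForIteration is a hypothetical helper, the sequence strings are placeholders rather than real NSEQL, and the iterations parameter is deliberately left out.

```javascript
// Hypothetical helper: picks the browsing sequence for a given iteration (0-based).
// Returns null when several sequences were given and all have been used.
function sequenceForIteration(sequences, iteration) {
  if (sequences.length === 0) {
    return null;
  }
  if (sequences.length === 1) {
    return sequences[0]; // a single sequence is reused in every iteration
  }
  return iteration < sequences.length ? sequences[iteration] : null;
}

const single = ["<NSEQL next-page sequence>"];
const several = ["<NSEQL sequence A>", "<NSEQL sequence B>"];
console.log(sequenceForIteration(single, 7));
console.log(sequenceForIteration(several, 1));
```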

5.3.21 Output

• Object: Output

• Description: places a record in the wrapper output.

• Functions:

  o Constructor(structure)

    • structure: parameter that indicates the component input record to be used as the wrapper result.

  o add(record): adds the component input record that is to be used as the wrapper result.

    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.

• Functions:

  o Constructor(recordsObj, name)

    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.

    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): adds a new field to the record under construction.

    • fieldName: name of the field.

    • expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:

      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.


5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created that access other pages from the pages processed by the Extractor component.
• Functions:
    o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
        • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
        • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session information. In general this value will be "true", although this may not be a good option if the previous iterator runs in parallel with it.
        • inputPage: optional. Indicates a start page.
    o exec(): returns a page object that represents the target page of the browsing sequences.
    o All of the methods offered by the Sequence component.


5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
    o Constructor(page)
        • page: page loaded in the browser that is going to be released.
    o Constructor(browserUuid)
        • browserUuid: browser identifier.
    o exec(): executes the component.


5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT... UNTIL). The Repeat component is implemented in JavaScript using a do... while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
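The control flow of Figure 7 can be tried outside ITPilot with a stand-in for the Condition object. In this minimal sketch, StubCondition is hypothetical: it only mimics an exec() call that returns a boolean and is not the ITPilot Condition component. It shows that the loop body always runs at least once, and that the loop is re-entered for as long as exec() returns true.

```javascript
// Hypothetical stand-in for an ITPilot Condition: exec() returns a boolean.
// Here it answers true while fewer than `limit` pages have been processed.
function StubCondition(state, limit) {
    this.exec = function (inputs) {
        return state.pages < limit;
    };
}

var state = { pages: 0 };
var repeat = new StubCondition(state, 3);
var visited = [];

do {
    // <loop_operations>: the work done on each pass of the loop
    state.pages = state.pages + 1;
    visited.push("page " + state.pages);
} while (repeat.exec([]));

console.log(visited.join(", ")); // prints "page 1, page 2, page 3"
```

Because the condition is evaluated after the body, a first pass happens even when the condition would be false from the start.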


5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.


5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
    o Constructor(sequence, sequenceType, reusableConnection, inputPage)
        • sequence: NSEQL browsing program (see [NSEQL]).
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
        • inputPage: optional parameter. Indicates the starting page. If it is not provided, the NSEQL program is run directly.
    o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
        • inputValues: list of values that can be used as input parameters within the browsing sequence.
        • inputPage: optional parameter. The page from which the component browsing sequence is run.
    o setRetries(count): sets the number of retries in the event of failures.
        • count: number of retries.
    o setRetryDelay(mseconds): sets the waiting time between retries.
        • mseconds: waiting time between retries, in milliseconds.
    o close(): closes the connection with the running browser.
    o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
        • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.


    o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
        • back: NSEQL back program.
    o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
        • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
    o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no post navigation has been configured as the back sequence.
        • pages: number of back pages.
    o toString(): returns the NSEQL sequence (see [NSEQL]).
    o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
        • 0: default browser implementation
        • 1: Internet Explorer browser implementation
        • 2: Firefox browser implementation
        • 3: Denodo HTTP browser implementation


5.3.28 Store File

• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
    o Constructor(content, file)
        • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case the page content is stored.
        • file: path and name of the file where the contents are to be stored.
    o exec(): runs the component.
    o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
        • generate: indicates whether the file name should be generated automatically.
    o setRetries(count): sets the number of retries in the event of failures.
        • count: number of retries.
    o setRetryDelay(mseconds): sets the waiting time between retries.
        • mseconds: waiting time between retries, in milliseconds.


5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the processing that follows an extraction operation is carried out concurrently on each of the records obtained.
• Functions:
    o wait(): causes the thread to wait until all the executions launched with the execute function have finished.
    o execute(functionName, <list of arguments>): launches the execution thread on the given function.
        • functionName: name of the JavaScript function to be run.
        • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
    o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
        • int: maximum number.


5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix in the path <DENODO_HOME>/metadata/itp-custom-components containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
    o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { ... }
    o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
    o This function defines the component output type. The possible values are:
        • LIST_TYPE = 1
        • PAGE_TYPE = 2
        • RECORD_TYPE = 3
        • SIMPLE_TYPE = 4
        • ARRAY_TYPE = 5
        • BINARY_TYPE = 6
        • BOOLEAN_TYPE = 7
        • DATE_TYPE = 8
        • DOUBLE_TYPE = 9
        • FLOAT_TYPE = 10
        • INT_TYPE = 11
        • LONG_TYPE = 12
        • STRING_TYPE = 13
        • URL_TYPE = 14
        • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
    o This function defines the output structure returned by the component. It is only necessary when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
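As an illustration, here is a minimal sketch of part of such a file for a hypothetical custom component named mycustom that returns a simple value. The function names and the numeric type code come from the list above. The body of mycustom_main (upper-casing its input), the treatment of the input as a plain string and the local SIMPLE_TYPE variable are assumptions made only so that the sketch runs on its own. mycustom_getInputStructure and mycustom_getOutputStructure are left out because the guide does not show their bodies.

```javascript
// Type code taken from the list above (SIMPLE_TYPE = 4). Declared locally so the
// sketch is self-contained; whether ITPilot predefines this name is not assumed.
var SIMPLE_TYPE = 4;

// Main function of the hypothetical component "mycustom".
// Assumption: the input is handled as a plain string.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    if (mycustom_input !== null && mycustom_input !== undefined) {
        mycustom_output = String(mycustom_input).toUpperCase();
    }
    return mycustom_output;
}

// Declares the output type of the component.
function mycustom_getOutputType() {
    return SIMPLE_TYPE;
}

console.log(mycustom_main("acme"), mycustom_getOutputType()); // prints "ACME 4"
```

Keeping the component name as the prefix of every function follows the naming pattern described above.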

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by one .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following code:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating the wrapper only requires writing a VQL statement of the following form:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just developed.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘’ character.


REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl


Example: if the Web service container is running on port 9090 of the acme host and the name chosen for the exported Web service was testWS, the access URLs for the information page in the REST (XML output) and HTML versions would be:

    http://acme:9090/testWS/rest
    http://acme:9090/testWS/html

For each operation, the input and output parameters are shown. For the REST version, a link is also shown to the .xsd file that describes the schema of the XML document returned by each operation. To access the XML Schema of the data returned by an operation of the REST version of the Web service, use the following URL format:

    http://host:port/serviceName/rest/opName.xsd

Example: again, if the Web service container runs on port 9090 of the acme host and the name chosen for the exported Web service was testWS, the following URL obtains the XML Schema of the data returned by the operation getPRODUCTDATA:

    http://acme:9090/testWS/rest/getPRODUCTDATA.xsd

The format used to invoke a specific operation in the REST version is the following:

    http://host:port/serviceName/rest/opName?paramName1=value1&...&paramNamen=valuen

where n is the number of parameters of the operation. The format for the HTML version is the same, replacing 'rest' with 'html'.

Example: the Web service container runs on port 9090 of the acme host and the name chosen for the exported Web service was testWS. Let us also suppose that the service has an operation called getPRODUCTDATA that requires no parameters. The operation can be invoked as follows in the REST and HTML versions of the Web service, respectively:

    http://acme:9090/testWS/rest/getPRODUCTDATA
    http://acme:9090/testWS/html/getPRODUCTDATA

If the operation to be invoked is getPRODUCTDATABYPRODID, which requires one input parameter called prod_id, the results for a prod_id value of 1 would be obtained with:

    http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1
    http://acme:9090/testWS/html/getPRODUCTDATABYPRODID?prod_id=1
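The invocation format above can also be assembled programmatically. The following is a minimal sketch: buildOperationUrl is a hypothetical helper, not part of ITPilot, and it assumes that parameter names and values should be percent-encoded with encodeURIComponent, which the guide does not state.

```javascript
// Hypothetical helper: builds the invocation URL of an exported operation.
// version is "rest" or "html"; params is an object of parameter name/value pairs.
function buildOperationUrl(host, port, serviceName, version, opName, params) {
    var url = "http://" + host + ":" + port + "/" + serviceName + "/" + version + "/" + opName;
    var pairs = [];
    for (var name in params) {
        if (Object.prototype.hasOwnProperty.call(params, name)) {
            // Assumption: standard percent-encoding of query-string components.
            pairs.push(encodeURIComponent(name) + "=" + encodeURIComponent(params[name]));
        }
    }
    // Operations without parameters are invoked without a query string.
    return pairs.length > 0 ? url + "?" + pairs.join("&") : url;
}

var restUrl = buildOperationUrl("acme", 9090, "testWS", "rest", "getPRODUCTDATABYPRODID", { prod_id: 1 });
var htmlUrl = buildOperationUrl("acme", 9090, "testWS", "html", "getPRODUCTDATA", {});
console.log(restUrl); // prints "http://acme:9090/testWS/rest/getPRODUCTDATABYPRODID?prod_id=1"
console.log(htmlUrl); // prints "http://acme:9090/testWS/html/getPRODUCTDATA"
```

Switching between the REST and HTML versions only changes the version segment of the path, as the text describes.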

2.3.1 HTML Output Configuration

The HTML version of the published Web services can be invoked with additional parameters that configure the HTML table used to display the query results. The additional parameters are as follows:

• shownumresults: if this parameter is given the value true, the table displays the number of results obtained by the wrapper.


• intervalsize: if this parameter is given, the results obtained by the wrapper are displayed in pages. The value of the parameter is the number of results displayed in each interval.

• maxresults: maximum number of results to be displayed. If the wrapper run returns more results than this, the excess results are discarded.

• cellwidth: maximum cell width, expressed in number of characters. The width of each cell in the table adapts to the text unless the size given in this parameter is exceeded. In that case, carriage returns are added to divide the text into lines.

• cellheight: maximum number of lines in a cell after the text has been divided according to the cellwidth parameter. If this is exceeded, all the cells of the column are given a scroll bar.

• width: maximum width of the table, in pixels. If the size is exceeded, a scroll bar is added.

• height: maximum height of the table, in pixels. If the size is exceeded, a scroll bar is added.

These parameters must be placed in the part of the URL that corresponds to the access path (before the query parameters), in the following format:

httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen

For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting the pagination interval size to 10. Once again, it is assumed that the Web service container runs on port 9090 of the acme machine:

httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10

2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES

Once the Web service operations have been exported, some parameters can be used to configure the connection pool that the Web services use to connect to the ITPilot server. The web.xml file, found in the WEB-INF path of the exported Web service (either inside the war file generated by ITPilot or in the directory where the Web service has been deployed), has three parameters for configuring the connection pool:

1. poolEnabled: enables or disables the connection pool. The possible values are "true" and "false".

    <env-entry>
        <env-entry-name>poolEnabled</env-entry-name>
        <env-entry-value>false</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

2. poolInitSize: defines the initial size of the connection pool.

    <env-entry>
        <env-entry-name>poolInitSize</env-entry-name>
        <env-entry-value>0</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>

3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this value, new requests are queued until a connection becomes free.

    <env-entry>
        <env-entry-name>poolMaxActive</env-entry-name>
        <env-entry-value>30</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>


3 ITPILOT DEVELOPMENT API

Denodo ITPilot includes a Java API for developing applications that use the wrappers created with it. Among other functions, this API makes it possible to connect to a Denodo ITPilot execution server, obtain a reference to a wrapper installed in that server and query it. It also supports additional tasks such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.

The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance can be used to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance can be used to carry out various tasks on the wrapper, the most important of which is query execution.

When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).

The following subsections deal in more detail with each of the stages mentioned: connecting to the server, obtaining references to wrappers, executing actions on them and processing query results. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].

3.1 CONNECTING TO THE SERVER

There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.

If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server runs.

If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, the connection can specify any database created in the Virtual DataPort server and use any user defined in it. The actions allowed for the user are those consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).

In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server runs, this constructor takes the name of the Virtual DataPort database to connect to, the user ID with which access is made and the associated password.

Note that, even if Virtual DataPort is installed, the server can still be accessed using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In that case, a default database called 'itpilot' is accessed, using the predefined user 'admin' (with the initial password 'admin').


3.2 OBTAINING WRAPPERS

As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class provides methods for obtaining data on the execution server and for accessing the wrappers present in it:

• Collection getHTMLWrapperNames(): obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database and only the wrappers associated with that database are obtained.

• HTMLWrapperProxy getHTMLWrapper(String wpName): obtains a reference to the wrapper whose name is given as the parameter.

• Collection getDatabaseNames(): this method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.

• void deleteWrapper(String wpName): deletes from the server the wrapper whose name is given as the parameter.

• void loadWrapper(String vql): takes as its input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.

• String getVQL(): returns the VQL description of all the wrappers in the ITPilot execution server.

3.3 USING WRAPPERS

Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:

    HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified when the wrapper was created. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper expects a float-type parameter and we want to assign the value 325 when invoking it, we must pass the string "325". For the float, double and date data types, make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or, for date data types, according to the date pattern if one was set. For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.

Although most applications will not require it, the schema of a wrapper can be obtained with the method:

    HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).

The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, Boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and these fields may again be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The call to getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the atomic fields that make up the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field:

• its type (method java.lang.Class getType());
• whether its value is obtained from the source or not, that is, whether it is a searchable field that does not appear in the output schema (method boolean isSearchStatus()) and, in that case, whether it is mandatory (method boolean isMandatoryStatus());
• if they were defined during the generation process, the regular expression (method java.lang.String getRegexp()) and the aliases defined for the field (method java.util.List getTextValues()).

Finally, the methods

    void setMaintenance(boolean value)
    void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

set through the API whether a wrapper should be maintained automatically by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be installed automatically in the ITPilot server, replacing the old one. If autodeploy is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.

3.4 PROCESSING QUERY RESULTS

The query method, used to execute queries on a wrapper, returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (remember that the wrapper obtains the data from the source in real time through the network).

The method hasNext() checks whether there are still elements to return. Because of the asynchronous behavior, this method must be called before accessing each element, to make sure that data is available.

The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method

    com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname)

where fieldname is the name of the desired field. The method next() throws a NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX represents the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:

• Boolean checkErrors(): Checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.

• String getErrorDescription(): If an error has occurred, returns a textual description of it; otherwise, it returns null. The custom error messages specified by the wrapper creator for the 'raise error' handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.

3.4.1 Canceling Queries

The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

void cancel()
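
One possible use of cancel() is to stop a query once enough results have been read. The sketch below is only an illustration under stated assumptions: CancellableResults and StubResults are hypothetical stand-ins that mimic the hasNext(), next() and cancel() methods named in this guide, so that the code can run without an ITPilot server. With the real client, the iterator returned by the query method would take their place.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class CancelSketch {

    /** Hypothetical stand-in for the HTMLWrapperResultIterator methods used here. */
    interface CancellableResults extends Iterator<Object> {
        void cancel();
    }

    /** Reads at most maxRows results, then cancels the query if more are pending. */
    static List<Object> readAtMost(CancellableResults results, int maxRows) {
        List<Object> rows = new ArrayList<Object>();
        while (rows.size() < maxRows && results.hasNext()) {
            rows.add(results.next());
        }
        if (results.hasNext()) {
            // More results would still arrive: stop the ongoing query.
            results.cancel();
        }
        return rows;
    }

    /** In-memory stub, present only so that the sketch is runnable. */
    static class StubResults implements CancellableResults {
        private final Iterator<String> source;
        boolean cancelled = false;

        StubResults(List<String> rows) {
            this.source = rows.iterator();
        }

        public boolean hasNext() {
            return !cancelled && source.hasNext();
        }

        public Object next() {
            return source.next();
        }

        public void remove() {
            throw new UnsupportedOperationException();
        }

        public void cancel() {
            cancelled = true;
        }
    }

    public static void main(String[] args) {
        StubResults results = new StubResults(Arrays.asList("a", "b", "c", "d"));
        List<Object> rows = readAtMost(results, 2);
        System.out.println(rows + " cancelled=" + results.cancelled);
    }
}
```

The check after the loop means that cancel() is only called when the limit was reached with results still pending; a query that finishes on its own is left untouched.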

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine 'acme', on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in the preceding section:

    TITLE
    DIRECTOR
    EDITIONS
        FORMAT
        PRICE
        DESCRIPTION

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and shown in the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text, except the field PRICE, which is a double. Finally, any errors produced during execution are checked.

    package com.denodo.itpilot.client;

    import java.util.List;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Iterator;

    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    // Assumption: DoubleVO lives in the same package as SimpleVO
    // (the original listing used it without importing it).
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

    public class ITPilotExample {

        public static void main(String args[]) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme:9999");

                // Get Wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ". ");

                    // Get and print atomic fields: TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);

                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);
                        RegisterVO editionVO = (RegisterVO) editions.next();

                        SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);

                        // PRICE is the only non-text field: it is a double
                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);

                        SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\t DESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server");
            } finally {
            }
        }
    }

Figure 1. Example of query execution against a wrapper

4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes included in a Jar file that is added to ITPilot, so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created with naming conventions alone, without dependencies on Denodo libraries, but we recommend developing them with Java annotations. The annotations, and the compound types and values required to create custom functions, are located in

    $DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading/removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class that contains its whole implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types, or both). For each signature of the function, the class must define a Java method implementing the function for those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

    Java                  ITPilot
    java.lang.Integer     int
    java.lang.Long        long
    java.lang.Float       float
    java.lang.Double      double
    java.lang.Boolean     boolean
    java.lang.String      text
    java.util.Calendar    date
    byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be Java primitive types (int, long, double, etc.).
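
The equivalence table can be read as a simple lookup from Java class to ITPilot type name. The sketch below is only an illustration in plain Java: ItpTypeTable and itpTypeOf are hypothetical names, not part of the Denodo libraries, and ITPilot performs the real mapping internally. It also shows why a primitive such as int has no entry, in line with the note above.

```java
import java.util.Calendar;
import java.util.LinkedHashMap;
import java.util.Map;

class ItpTypeTable {

    /** Java class to ITPilot type name, copied from the equivalence table. */
    static final Map<Class<?>, String> TYPES = new LinkedHashMap<Class<?>, String>();

    static {
        TYPES.put(Integer.class, "int");
        TYPES.put(Long.class, "long");
        TYPES.put(Float.class, "float");
        TYPES.put(Double.class, "double");
        TYPES.put(Boolean.class, "boolean");
        TYPES.put(String.class, "text");
        TYPES.put(Calendar.class, "date");
        TYPES.put(byte[].class, "binary");
    }

    /** Returns the ITPilot type name, or null if the class is not in the table. */
    static String itpTypeOf(Class<?> javaType) {
        return TYPES.get(javaType);
    }

    public static void main(String[] args) {
        System.out.println(itpTypeOf(Integer.class)); // wrapper class: listed in the table
        System.out.println(itpTypeOf(int.class));     // primitive: not listed, so null
    }
}
```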

4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.

For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

Thus, a Java class named Concat_SampleItpFunction is interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String generates the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:
    • name: name of the custom function.
    • type: in ITPilot, it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to mark the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional attribute, syntax, which specifies how the function signature is presented to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to mark the method as the one that computes the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam: Parameter annotation with the attribute name, used to make the automatically generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
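
As a minimal sketch of the naming conventions described at the start of this section, the class below needs no Denodo library: its name ends in "ItpFunction" and its only method is named execute, with an array as last parameter to obtain arity n. The way null inputs are skipped is an assumption made for illustration, and the class is declared without the public modifier only so that the sketch compiles in any source file; a class deployed in ITPilot may need to be public.

```java
/**
 * Custom function declared through naming conventions only.
 * By the pattern <FunctionName> + "ItpFunction", it would be named Concat_Sample.
 */
class Concat_SampleItpFunction {

    /** Signature with arity n: the last (and only) parameter is an array. */
    public String execute(String... inputs) {
        StringBuilder joined = new StringBuilder();
        for (String input : inputs) {
            if (input != null) { // assumption: null inputs are skipped
                joined.append(input);
            }
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        Concat_SampleItpFunction concat = new Concat_SampleItpFunction();
        System.out.println(concat.execute("IT", "Pilot"));
    }
}
```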

4.2 COMPOUND TYPES

Compound types and values in custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on the input values, and functions returning compound types, can implement an additional method that computes the return type without executing the function. Except in the case described in rule 1 below, this is optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.

What the additional method returns depends on what the execute method returns:

• If the execute method returns a basic Java type, the additional method has to return the same basic Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 for the type that these return values will have in ITPilot.
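
The pairing between the execute method and its additional method can be sketched for the simplest case, a basic Java type, using the naming conventions of section 4.1. Repeat_SampleItpFunction is a hypothetical function written for this illustration; declaring the return type of executeReturnType as Class<?> and returning null for invalid input are assumptions, not documented behavior. As in the previous notes, the parameters use wrapper classes rather than primitives.

```java
/** Hypothetical function Repeat_Sample: repeats a text a given number of times. */
class Repeat_SampleItpFunction {

    /** Execution method: returns a String, which maps to the ITPilot type text. */
    public String execute(String value, Integer times) {
        if (value == null || times == null || times < 0) {
            return null; // assumption: invalid input yields null
        }
        StringBuilder repeated = new StringBuilder();
        for (int i = 0; i < times; i++) {
            repeated.append(value);
        }
        return repeated.toString();
    }

    /**
     * Additional method: same number and types of parameters as execute.
     * Since execute returns a String, this method returns String.class.
     */
    public Class<?> executeReturnType(String value, Integer times) {
        return String.class;
    }

    public static void main(String[] args) {
        Repeat_SampleItpFunction repeat = new Repeat_SampleItpFunction();
        System.out.println(repeat.execute("ab", 3));
        System.out.println(repeat.executeReturnType("ab", 3).getName());
    }
}
```

Here the return type is constant, so the additional method is optional under rule 1; it is shown only to make the correspondence between the two signatures concrete.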

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name="regexp") String regex,
                @CustomParam(name="value") String value) {

            if (value == null || regex == null) {
                return null;
            }

            String[] result = value.split(regex);
            List<CustomRecordValue> arrayValues =
                    new ArrayList<CustomRecordValue>(result.length);

            for (String string : result) {
                // A fresh map per record, so that records never share state
                LinkedHashMap<String, Object> results =
                        new LinkedHashMap<String, Object>(1);
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue =
                        CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }

            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        // Computes the return type (an array of records with one text field)
        // without executing the function
        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2. ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the third edition of the ECMA-262 standard [ECMA262].

The following sections assume some basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", "", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3. ITPilot Wrapper Skeleton in JavaScript

Each script can contain three functions, one mandatory and two optional:

1. main(): the only mandatory function; it contains the component implementation.
2. getInit(): must be used to return the set of searchable parameters.
3. getOutputSchema(): used to return the structure of the output objects, if they exist. Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.

These functions correspond to the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

5.2.1 Initialization of Searchable Parameters

The getInit() function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init();, creates a new parameter initialization object. This object is described in section 5.3 (the Component Catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.

5.2.3 Generating the Output Structure

The getOutputSchema() function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is described in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter lists the predefined ITPilot components. Each component is represented as an instantiable JavaScript object, with a series of functions that are described below.

NOTE: Some of the parameters of the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

    rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL).
    rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.

5.3.2.1 Record Structure

• Object: Record_Structure
• Description: Represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
    o Constructor(name)
        • name: name of the structure.
    o setText(field, regexp, type): creates a new character string field in the record.

        • field: name of the new field.
        • regexp (optional): regular expression for the character string. By default, if no constraint exists, its value is "".
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setLink(field, type): creates a new Link-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setInt(field, type): creates a new Integer-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setBoolean(field, type): creates a new boolean-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setLong(field, type): creates a new Long-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setFloat(field, type): creates a new Float-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setDouble(field, type): creates a new Double-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setDate(field, regexp, format, type): creates a new Date-type field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression for the character string. By default, if no constraint exists, its value is "".
        • format (optional): date format following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setRegister(record, type): creates a new Record-type field in the record.
        • record: record name.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setArray(name, structure, type): creates a new Array-type field in the record.
        • name: name of the array.
        • structure: data structure that represents the record structure contained in the array.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR component, explained in section 5.3.22, must be used, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List
• Functions:
    o setListName(listName): sets the name of the list.
        • listName: name of the list.
    o add(obj): adds an element to the list.
        • obj: element to add.
    o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all, or almost all, components and are therefore described in this first section. The catalog indicates which components do not provide some of these "common" functions.

5.3.3.1 onError function

• onError(errorId, errorAction): Tells the component how to behave when an error occurs. The onError function can be invoked several times with different errorId parameter values.
    o errorId: indicates the type of error whose behavior is being configured. The possible values are:
        • RUNTIME_ERROR: error while the component is being run.
        • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
        • HTTP_ERROR: error produced by an HTTP error.
        • TIMEOUT_ERROR: error caused when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the execution environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for http browser.
        • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
    o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
        • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
        • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have any kind of return value will return "null" if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it is evaluated as "false" if they are used in a condition expression.
        • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each parameter.
        • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error persists after the retries.

5.3.3.2 debugLevel function

• debugLevel(level): Sets the trace level used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:
    o TRACE
    o DEBUG
    o INFO
    o WARN
    o ERROR
    o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
    o Constructor()
    o exec(record, list): executes the function.
        • record: record to be added to the list.
        • list: list to which the record is added.

5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
    o Constructor(expr)
        • expr: defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
        • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements on which the condition is evaluated.

5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
    o Constructor(listname): creates an empty list.
        • listname: name of the list of records to be created.
    o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
    o Constructor(): creates a persistent browser and returns its handler.
    o exec(): executes the component.

5.3.8 Diff

• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
    o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
        • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
        • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
        • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
        • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
        • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
    o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
        • baseCode: character string with the source page content.
        • finalCode: character string or page object with the target page content.
    o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
        • baseCode: character string with the source page content.
        • finalCode: character string or page object with the target page content.
    o setAdditionPrefixLabel(additionPrefixLabel): modifies the additional data starting tag.
        • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    o setAdditionSuffixLabel(additionSuffixLabel): modifies the additional data ending tag.
        • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    o setDeletionPrefixLabel(deletionPrefixLabel): modifies the deleted data starting tag.
        • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

    o setDeletionSuffixLabel(deletionSuffixLabel): modifies the end tag used for deleted content.
        • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
        • nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.
    o setIgnoreTagAttributes(simplifyTags): the component does not take the HTML tag attributes into account when comparing both pages.
        • simplifyTags: "true" means that the HTML tag attributes are ignored; with "false" they are not ignored.
    o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
        • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
    o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page.
        • mergedDeletions: with "true", the deleted content is shown. With "false", the configuration set by the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
    o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
        • replacement: Perl [PERL] regular expression.
    o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page; tokens that match a regular expression are discarded before the comparison starts.
        • regexp: Perl [PERL] regular expression.
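
The role of the prefix and suffix labels and of the token separator can be pictured with a small stand-alone sketch. It is not the ITPilot Diff implementation: the function diffTokens, the option showRemovedContent as a plain property, and the "[+", "+]", "[-", "-]" labels are hypothetical, and a basic longest-common-subsequence comparison is assumed as the matching technique.

```javascript
// Hypothetical stand-in for the idea behind Diff; not the ITPilot API.
// Both pages are split into tokens, a longest common subsequence (LCS) is
// computed, and added or deleted tokens are wrapped in the configured labels.
function diffTokens(baseCode, finalCode, options) {
  const opts = Object.assign({
    additionPrefixLabel: "[+", additionSuffixLabel: "+]",
    deletionPrefixLabel: "[-", deletionSuffixLabel: "-]",
    tokenSeparator: " ",
    showRemovedContent: true
  }, options);
  const a = baseCode.split(opts.tokenSeparator);
  const b = finalCode.split(opts.tokenSeparator);

  // lcs[i][j] = length of the LCS of a[i..] and b[j..]
  const lcs = [];
  for (let i = 0; i <= a.length; i++) {
    lcs.push(new Array(b.length + 1).fill(0));
  }
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both token lists, emitting common, added and deleted tokens.
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && a[i] === b[j]) {
      out.push(a[i]);
      i++;
      j++;
    } else if (j < b.length && (i === a.length || lcs[i][j + 1] >= lcs[i + 1][j])) {
      out.push(opts.additionPrefixLabel + b[j] + opts.additionSuffixLabel);
      j++;
    } else {
      if (opts.showRemovedContent) {
        out.push(opts.deletionPrefixLabel + a[i] + opts.deletionSuffixLabel);
      }
      i++;
    }
  }
  return out.join(opts.tokenSeparator);
}

const marked = diffTokens("<p> old text </p>", "<p> new text </p>");
// marked is "<p> [+new+] [-old-] text </p>"
```

In this sketch, passing { showRemovedContent: false } drops the deleted tokens from the result, which mirrors why the deletion labels have no effect in that case.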

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
    o Constructor(expression)
        • expression: object that defines the condition expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
        • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor
• Description: responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
    o Constructor(name, page, specification, structure)
        • name: name of the Extractor component instance.
        • page: page-type ITPilot structure from which data is to be extracted.
        • specification: DEXTL data extraction specification (see [DEXTL]).
        • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
    o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
    o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
        • merge: Boolean parameter; "true" if the pattern merge technique is to be applied, "false" if not. It is "true" by default.
    o setI18n(i18n): updates the internationalization configuration of the process.
        • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
    o Constructor(url, sequenceType, reusableConnection, binary, page)
        • url: URL where the resource to be downloaded can be found (OPTIONAL).
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
        • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
        • page: optionally, the page from which the HTTP request is launched.
    o exec(page): runs the component, returning the string- or binary-type value obtained.
        • page: optionally, the page from which the HTTP request is launched.
    o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
        • encoding: MIME type of the information to send.
    o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were initially used to access that page. This is the default synchronization method.
        • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
    o setBackSequence(back): optionally sets an explicit navigation sequence back to the source page, against which further information extraction operations are going to be executed.
        • back: back sequence NSEQL program.
    o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

        • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.
    o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when no back sequence has been defined and POST navigation has not been configured.
    o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
        • 0: default browser implementation
        • 1: Internet Explorer browser implementation
        • 2: Firefox browser implementation
        • 3: Denodo HTTP browser implementation
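
The interplay between syncWithPost, setBackSequence and setBackPages amounts to a decision order. The following is a minimal sketch of that order only; the helper chooseBackStrategy, the shape of its configuration object and the default of one page are assumptions made for illustration and are not part of the ITPilot API.

```javascript
// Hypothetical helper, not part of ITPilot: reports which strategy would be
// used to return to the source page, following the order described for
// syncWithPost, setBackSequence and setBackPages.
function chooseBackStrategy(config) {
  if (config.syncWithPost) {
    // Default method: re-send the POST that originally produced the page.
    return { strategy: "POST" };
  }
  if (config.backSequence) {
    // An explicit NSEQL back sequence was configured.
    return { strategy: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // Neither applies: fall back to the NSEQL Back() command.
  // Going back one page when backPages is unset is an assumption of this sketch.
  const pages = config.backPages === undefined ? 1 : config.backPages;
  return { strategy: "BACK_COMMAND", pages: pages };
}

const chosen = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
// chosen.strategy is "BACK_COMMAND" and chosen.pages is 2
```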

5.3.13 Filter

• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
    o Constructor(expr, auxiliaryRecords)
        • expr: regular expression of the filtering operation for the list of records described in the exec function.
        • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
    o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subgroup that complies with the selection expression indicated in the constructor.
        • inputRecords: list of input records.
        • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
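
The effect described in the note can be illustrated with a stand-alone sketch. The function filterRecords, the predicate and the PRICE field are hypothetical, and the two constants are redefined locally as plain strings; none of this is the ITPilot Filter component itself.

```javascript
// Hypothetical stand-in, not the ITPilot Filter component.
// The constants are local to this sketch.
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function filterRecords(inputRecords, predicate, onError) {
  const result = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record)) {
        result.push(record);
      }
    } catch (e) {
      // With ON_ERROR_IGNORE only the failing record is left out;
      // with any other setting the error is propagated.
      if (onError !== ON_ERROR_IGNORE) {
        throw e;
      }
    }
  }
  return result;
}

const records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];
const atLeastTen = function (record) {
  if (record.PRICE === null) {
    throw new Error("PRICE has no value");
  }
  return record.PRICE >= 10;
};
const kept = filterRecords(records, atLeastTen, ON_ERROR_IGNORE);
// kept contains the records with PRICE 10 and PRICE 30
```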

5.3.14 Form Iterator

• Object: Form_Iterator
• Description: generates an execution loop over a specific form, in which predetermined values for each of its fields are used in each run.
• Functions:
    o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
        • findForm: NSEQL program that locates the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
        • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
        • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.
        • inputPage: input page from which the selected form can be iteratively invoked.
        • parallelIterator: with "true", the component executes its iterations in parallel.
    o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
        • field: name of the multiple selection field.
        • position: position of the field among those with the same name, starting with position 0.
        • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
        • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:
            • CLICKED_ELEMENT: mark the element.
            • NON_CLICKED_ELEMENT: leave the element unmarked.

            • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
    o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field of the chosen form.
        • field: name of the multiple selection field.
        • position: position of the field among those with the same name, starting with position 0.
        • valuesArray: list of values that must be selected in the field.
        • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
        • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
        • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:
            • CLICKED_ELEMENT: mark the element.
            • NON_CLICKED_ELEMENT: leave the element unmarked.
            • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
    o selectPositions(field, position, positions): indicates the values selected from a selection field of the chosen form.
        • field: name of the HTML selection field.
        • position: position of the field in the event of more than one field with the same name.
        • positions: values of the elements on which the component must iterate.
    o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
        • field: name of the HTML text field.
        • position: position of the field in the event of several fields on the form with the same name.
        • values: list of values that must be selected in the field.
        • positions: list that indicates the position held for each value element in the event of replicated values.
        • equal: boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.

    o click(field, value, state): selects an element and runs a "click" event on it.
        • field: name of the HTML field on which the click is to be made.
        • value: when this function is run on Radio Buttons, this parameter indicates the elements selected as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.
        • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
            • CLICKED_ELEMENT: mark the element.
            • NON_CLICKED_ELEMENT: leave the element unmarked.
            • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
    o input(field, position, values): indicates the values added to an input field.
        • field: name of the HTML input field.
        • position: position of the field in the event of several fields on the form with the same name.
        • values: list of values that must be selected in the field.
    o textarea(field, position, values): indicates the values added to a text area.
        • field: name of the HTML input field.
        • position: position of the field in the event of several fields on the form with the same name.
        • values: list of values that must be selected in the field.
    o toList(): returns the list with the NSEQL sequences used in each iteration.
    o setMaxIterations(count): sets the maximum number of iterations that can be executed.
        • count: maximum number of iterations.
    o setRetries(count): updates the number of retries in the event of failure.
        • count: number of retries.
    o setRetryDelay(mseconds): sets the waiting time between retries.
        • mseconds: waiting time between retries, in milliseconds.
    o setParallelIterator(flag): makes the component launch its iterations in parallel.
        • flag: with "true", the iterations are executed in parallel.
    o next(inputPage): returns the page resulting from running a component iteration.
        • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.

    o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
    o close(): closes the iterator.
    o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
        • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run.
    o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
        • back: NSEQL back program.
    o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
        • reusingConnection: if "true", the connection from previous components is reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session.
    o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and POST navigation has not been configured as the way back.
    o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
        • 0: default browser implementation
        • 1: Internet Explorer browser implementation
        • 2: Firefox browser implementation
        • 3: Denodo HTTP browser implementation
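
How the per-field values, the CLICKED_AND_NON_CLICKED_ELEMENT state and the hasNext()/next() pair fit together can be sketched outside ITPilot. The sketch below assumes that the component visits every combination of the configured field values (a Cartesian product); that assumption, the helpers expandState and combinationIterator, the field names and the string values given to the constants are all hypothetical.

```javascript
// Hypothetical stand-in, not the ITPilot Form_Iterator.
// The constant names come from the text; their values are local to this sketch.
const CLICKED_ELEMENT = "clicked";
const NON_CLICKED_ELEMENT = "not clicked";
const CLICKED_AND_NON_CLICKED_ELEMENT = "both";

// "Both" expands to two states, one per combination to generate.
function expandState(state) {
  return state === CLICKED_AND_NON_CLICKED_ELEMENT
    ? [CLICKED_ELEMENT, NON_CLICKED_ELEMENT]
    : [state];
}

// fields: [{ name, values: [...] }]. Returns an iterator over every
// combination of values, optionally capped at maxIterations.
function combinationIterator(fields, maxIterations) {
  let combos = [{}];
  for (const field of fields) {
    const nextCombos = [];
    for (const combo of combos) {
      for (const value of field.values) {
        nextCombos.push(Object.assign({}, combo, { [field.name]: value }));
      }
    }
    combos = nextCombos;
  }
  if (maxIterations !== undefined) {
    combos = combos.slice(0, maxIterations);
  }
  let index = 0;
  return {
    hasNext: function () { return index < combos.length; },
    next: function () { return combos[index++]; }
  };
}

const formFields = [
  { name: "city", values: ["Madrid", "Lisbon"] },
  { name: "newsletter", values: expandState(CLICKED_AND_NON_CLICKED_ELEMENT) }
];
const iterator = combinationIterator(formFields);
const seen = [];
while (iterator.hasNext()) {
  seen.push(iterator.next());
}
// seen holds four combinations: two cities, each with the box marked and unmarked
```

The cap applied through maxIterations plays the role that setMaxIterations is described as having: it bounds how many combinations are actually run.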

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool using a previously retrieved identification code.
• Functions:
    o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identification.
        • browserUuid: browser id.
    o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters for later browsing by using a Sequence object (see section 5.3.27).
        • pageType: type of browser used to access the page:
            • SEQUENCE_IEBROWSER = 1
            • SEQUENCE_HTTP_BROWSER = 2
        • lastURL: last URL the page is coming from.
        • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
        • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
        • cookie: information storage "cookies".
        • proxyUser: user name to access the proxy, if required.
        • proxyPassword: user password to access the proxy, if required.
        • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: responsible for storing the structure of the input data, which is the data that the wrapper receives from the calling application.
• Functions:
    o Constructor(input, output)
        • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
        • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).
    o get(name): returns the value of a field of the record created as the group of initialization parameters.
        • name: name of the record field.
    o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether the field to create is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether the field to create is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.

    o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether the field to create is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether the field to create is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether the field to create is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
        • field: name of the field to create.

        • obl: indicates whether the field to create is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether the field to create is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether the field to create is compulsory in queries. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
        • field: name of the field to create.
        • format: representation format of the date field. The format is optional, but once it has been specified it becomes compulsory; otherwise the wrapper may not run. This representation format is defined in [DATEFORMAT].
        • obl: indicates whether the field to create is compulsory in queries. The possible values are:

            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setName(name): updates the component name.
        • name: new component name.
    o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
        • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
    o exec(): main function for running the component; returns a record representing the wrapper initialization parameters.
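
The three values of the obl parameter can be read as rules applied when the initialization record is built. The sketch below is a plain JavaScript illustration of those rules, not ITPilot code: buildInputRecord, the field definitions and the field names are hypothetical, the constants are redefined locally, and two behaviours are assumptions of the sketch (a FIXED field ignores any value supplied in the query, and an OPTIONAL field without a value is stored as null).

```javascript
// Hypothetical stand-in for the OPTIONAL / OBLIGATORY / FIXED rules;
// the constants are local to this sketch.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

function buildInputRecord(fieldDefs, queryParams) {
  const record = {};
  for (const def of fieldDefs) {
    const supplied = Object.prototype.hasOwnProperty.call(queryParams, def.field);
    if (def.obl === FIXED) {
      // Assumption: the constant always wins over a supplied value.
      record[def.field] = def.fixedValue;
    } else if (supplied) {
      record[def.field] = queryParams[def.field];
    } else if (def.obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + def.field);
    } else {
      // Assumption: an OPTIONAL field without a value is stored as null.
      record[def.field] = null;
    }
  }
  return record;
}

const fieldDefs = [
  { field: "KEYWORD", obl: OBLIGATORY },
  { field: "MAX_PRICE", obl: OPTIONAL },
  { field: "COUNTRY", obl: FIXED, fixedValue: "ES" }
];
const inputRecord = buildInputRecord(fieldDefs, { KEYWORD: "laptop", COUNTRY: "US" });
// inputRecord is { KEYWORD: "laptop", MAX_PRICE: null, COUNTRY: "ES" }
```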

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records one by one.
• Functions:
    o Constructor(list)
        • list: list of records on which to iterate.
    o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.
    o next(): returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
    o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
        • uuid: unique identifier of the component.
        • uri: connection URL to the database.
        • driver: driver class to use to connect to the data source.
        • userName: user name.
        • password: user password.
        • structure: structure of the component's output record list. It is defined as a record of values.
        • baseRecords: record list to be used.
        • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
        • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
        • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
        • query: SQL query that returns the results required by the component.
    o exec(query, baseRecords): executes the JDBCExtractor component.
        • query: SQL query that returns the results required by the component.
        • baseRecords: record list to be used.
    o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
        • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
        • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
        • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.

    o disablePool(): disables the connection pool.
    o addDriverProperty(propname, propvalue): adds a JDBC driver property.
        • propname: property name.
        • propvalue: property value.
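
The meaning of maxPoolSize, initialPoolSize and the check query can be pictured with a generic pool sketch. It does not reproduce the pool used by JDBCExtractor: createPool, the connect and isAlive callbacks and the connection objects are hypothetical, and isAlive merely stands in for running the check query against a cached connection.

```javascript
// Hypothetical generic pool, not the JDBCExtractor implementation.
function createPool(options) {
  const idle = [];
  let total = 0;

  function open() {
    total++;
    return options.connect();
  }

  // initialPoolSize idle connections are established up front.
  for (let n = 0; n < options.initialPoolSize; n++) {
    idle.push(open());
  }

  return {
    acquire: function () {
      while (idle.length > 0) {
        const conn = idle.pop();
        // isAlive stands in for running the check query on the connection.
        if (options.isAlive(conn)) {
          return conn;
        }
        total--; // discard a connection that failed the check
      }
      if (total >= options.maxPoolSize) {
        throw new Error("Pool exhausted");
      }
      return open();
    },
    release: function (conn) {
      idle.push(conn);
    },
    size: function () {
      return total;
    }
  };
}

let opened = 0;
const pool = createPool({
  maxPoolSize: 3,
  initialPoolSize: 2,
  connect: function () { opened++; return { id: opened, alive: true }; },
  isAlive: function (conn) { return conn.alive; }
});
const first = pool.acquire(); // served from the idle connections
pool.release(first);
```

Raising an error when the pool is exhausted is a simplification chosen for this sketch; a real pool may instead wait for a connection to be released.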

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: iterates over different inter-related pages by means of one or several browsing sequences.
• Functions:
    o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
        • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
        • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information.
        • inputPage: indicates the page from which the next browsing sequence is to be run.
    o next(inputRecords, inputPage): returns the next iteration element.
        • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
        • inputPage: indicates the page from which the next pages are to be accessed.
    o close(): closes the iterator.
    o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
        • count: number of retries.
    o setRetryDelay(count): configures the interval between two retries.
        • count: interval in milliseconds.

  o syncWithPost(flag): this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.

  o setBackSequence(back): this function optionally allows a browsing sequence that returns explicitly to the source page to be indicated, so that more data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as back sequence.

  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: HTTP browser implementation

5.3.21 Output

• Object: Output

• Description: this places a record in the wrapper output.

• Functions:

  o Constructor(structure)

    • structure: parameter that indicates the component input record to be used as the wrapper result.

  o add(record): this adds the component input record that is to be used as the wrapper result.

    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: this allows a record to be constructed using other records generated in the flow, as well as generating new attributes derived from already existing ones.

• Functions:

  o Constructor(recordsObj, name)

    • recordsObj: list of input elements. Each element from the list can be a record or a list of records.

    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.

    • fieldName: name of the field.

    • expression: field definition expression, e.g. “$0PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    • errorAction: action to be run in the event of it not being possible to evaluate the expression correctly. The possible values are:

      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

      • ON_ERROR_IGNORE: ignore the error, continuing with the wrapper run.

  o exec(): this runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements except for the one that caused the error.

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: This creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.

    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used to date is reused, or whether a new browser is launched maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.

    • inputPage: optional; this allows a starting page to be indicated.

  o exec(): this returns a page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as browser identifier and releases that specific browser.

• Functions:

  o Constructor(page)

    • page: page loaded on the browser that is going to be released.

  o Constructor(browserUuid)

    • browserUuid: browser identifier.

  o exec(): executes the component.

5.3.25 Repeat

• Description: This allows for loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));

Figure 7 Using the Repeat function
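
The practical difference between the while form used by Loop and the do… while form used by Repeat can be seen in a minimal sketch in plain JavaScript. The makeCondition helper is a hypothetical stand-in for a Condition object (a real Condition is built from a condition expression, see section 5.3.5); only the exec([]) call mirrors the figures. The counters and limits are illustrative assumptions.

```javascript
// Hypothetical stand-in for a Condition object: exec() returns true
// while the counter held in "state" is below "limit".
function makeCondition(state, limit) {
    return {
        exec: function (inputs) {
            return state.count < limit;
        }
    };
}

// while form (Loop): the condition is checked BEFORE the first pass,
// so the body may run zero times.
var loopState = { count: 5 };
var loop = makeCondition(loopState, 3);
var loopRuns = 0;
while (loop.exec([])) {
    loopState.count++;
    loopRuns++;
}

// do… while form (Repeat): the body runs once BEFORE the condition
// is checked for the first time.
var repeatState = { count: 5 };
var repeat = makeCondition(repeatState, 3);
var repeatRuns = 0;
do {
    repeatState.count++;
    repeatRuns++;
} while (repeat.exec([]));
```

With a condition that is already false at the start, the while body is skipped entirely, whereas the do… while body still executes once.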

5.3.26 Script

• Description: This component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When this component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: This creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)

    • sequence: NSEQL browsing program (see [NSEQL]).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • inputPage: optional parameter; this indicates the starting page. If it is not given, the NSEQL program is run directly.

  o exec(inputValues, inputPage): this runs the Sequence component, returning the last page that the browsing sequence has reached.

    • inputValues: list of values that can be used as input parameters within the browsing sequence.

    • inputPage: optional parameter; this describes the page from which the component browsing sequence is run.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o close(): this closes the connection with the running browser.

  o syncWithPost(flag): this method indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.

  o setBackSequence(back): this function optionally allows a browsing sequence that returns explicitly to the source page to be indicated, so that more data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as back sequence.

    • pages: number of back pages.

  o toString(): this returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation

5.3.28 Store File

• Object: StoreFile

• Description: this stores the contents entered as the input parameter in a file.

• Functions:

  o Constructor(content, file)

    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input. In that case, the page content will be stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): this function determines whether the output file name should be automatically generated when the input file is null or is a directory.

    • generate: indicates whether the file name should be automatically generated.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.
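
The interplay of a retry count and a retry delay, as configured through setRetries and setRetryDelay, can be sketched in plain JavaScript. This is only an illustration under stated assumptions: runWithRetries and its sleep argument are hypothetical helpers, not part of the ITPilot API, and the sketch assumes that the retry count means additional attempts after the first failure, which this guide does not state explicitly.

```javascript
// Hypothetical helper (not an ITPilot function): runs "operation" and,
// if it throws, retries it up to "retries" more times, calling
// sleep(retryDelayMs) before each new attempt.
function runWithRetries(operation, retries, retryDelayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return operation(attempt);
        } catch (e) {
            if (attempt >= retries) {
                throw e; // retries exhausted: propagate the last error
            }
            attempt++;
            sleep(retryDelayMs);
        }
    }
}

// Usage: an operation that fails on its first two attempts.
// The sleep stand-in only records the requested delays.
var waits = [];
var result = runWithRetries(
    function (attempt) {
        if (attempt < 2) {
            throw new Error("transient failure");
        }
        return "stored on attempt " + attempt;
    },
    3,   // assumed meaning: up to 3 retries after the first failure
    500, // delay between retries, in milliseconds
    function (ms) { waits.push(ms); }
);
```

In this run the operation succeeds on its third attempt, so the delay is requested twice and the remaining retry is never used.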

5.3.29 Thread

• Object: Thread

• Description: this represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): this causes the thread to enter standby until all executions invoked with the execute function have finished.

  o execute(functionName, <list of arguments>): this launches a run thread on the described function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.

    • int: maximum number.
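
The queuing behavior described for setMaxConcurrentThreads can be pictured with a small single-threaded simulation in plain JavaScript. BoundedRunner is a hypothetical stand-in written for this sketch; it is not the ITPilot Thread object and does not start real threads. It only tracks which requests would be running and which would be waiting.

```javascript
// Hypothetical simulation (not the ITPilot Thread object): at most
// maxConcurrent tasks are "running"; later requests wait in a queue.
function BoundedRunner(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = []; // names of tasks currently executing
    this.queued = [];  // names of tasks waiting for a free slot
}

// Request execution of a task: it starts if a slot is free,
// otherwise it is queued.
BoundedRunner.prototype.execute = function (name) {
    if (this.running.length < this.maxConcurrent) {
        this.running.push(name);
    } else {
        this.queued.push(name);
    }
};

// Mark a running task as finished and start the oldest queued task.
BoundedRunner.prototype.finish = function (name) {
    var index = this.running.indexOf(name);
    if (index === -1) {
        return; // not a running task: nothing to do
    }
    this.running.splice(index, 1);
    if (this.queued.length > 0) {
        this.running.push(this.queued.shift());
    }
};

var runner = new BoundedRunner(2);
runner.execute("record1");
runner.execute("record2");
runner.execute("record3"); // queued: both slots are taken
var queuedBefore = runner.queued.slice();
runner.finish("record1");  // frees a slot, so record3 starts
```

The third request only starts once one of the first two has finished, which is the effect the maximum is meant to have on requests beyond the limit.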

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically using the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) {
      var mycustom_output = null;
      …
      return mycustom_output;
  }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This is the function that defines the component output type. The possible values are:

    • LIST_TYPE = 1

    • PAGE_TYPE = 2

    • RECORD_TYPE = 3

    • SIMPLE_TYPE = 4

    • ARRAY_TYPE = 5

    • BINARY_TYPE = 6

    • BOOLEAN_TYPE = 7

    • DATE_TYPE = 8

    • DOUBLE_TYPE = 9

    • FLOAT_TYPE = 10

    • INT_TYPE = 11

    • LONG_TYPE = 12

    • STRING_TYPE = 13

    • URL_TYPE = 14

    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
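
As a minimal sketch of these conventions, the following shows part of a hypothetical custom component named “upper” that returns its input in upper case. Several points are assumptions made for illustration: the component name, the idea that the input arrives as a plain string, and the local declaration of STRING_TYPE (declared here only so that the sketch is self-contained, using the value listed above). The upper_getInputStructure function is omitted because this section does not show its body, and upper_getOutputStructure is not needed because the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Output type code, taken from the list of possible values above.
// Declared locally only to keep this sketch self-contained.
var STRING_TYPE = 13;

// Main function of the hypothetical "upper" component.
// Assumption: the input is delivered as a plain string value.
function upper_main(upper_input) {
    var upper_output = null;
    upper_output = String(upper_input).toUpperCase();
    return upper_output;
}

// Declares that the component returns a string.
function upper_getOutputType() {
    return STRING_TYPE;
}
```

Following the naming rule, every function carries the component name as a prefix, so a file for this sketch would be named after the component as well.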

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it should be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file, the name of which matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> represents the name of the component.

• <input_parameters> is the list of input parameters the custom component receives as input.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is simple: the VQL statement is written as follows:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

Page 10: Denodo ITPilot 4.6 Developer Guidehelp.denodo.com/.../4.6/DenodoITPilot.Developer.pdf · 2017-10-25 · ITPilot 4.6 Developer Guide Deploying and Invoking ITPilot Wrapper Access Web

• intervalsize: if this parameter is indicated, the results obtained by the wrapper are displayed paginated. The value of the parameter indicates the number of results to be displayed in each interval.

• maxresults: this indicates the maximum number of results to be displayed. If the wrapper run returns more results than those indicated, all excess results are discarded.

• cellwidth: maximum cell width, expressed in number of characters. The width of each cell in the table is adapted to the text, except where the size indicated in this parameter is exceeded. In that case, carriage returns are added to divide the text into lines.

• cellheight: maximum number of lines in a cell after having divided the text according to the cellwidth parameter value. If this is exceeded, all the cells of this column are given a scroll bar.

• width: this specifies the maximum width (in pixels) of the table. If the size is exceeded, a scroll bar is added.

• height: this specifies the maximum height (in pixels) of the table. If the size is exceeded, a scroll bar is added.

These parameters must be indicated in the part of the URL corresponding to the access path (before the query parameters), in the following format:

httphostportserviceNamehtmlopNameparamName1value1paramNamenvaluen

For example, the following expression invokes the getPRODUCTDATA operation, limiting the number of results displayed to 50 and setting a maximum pagination interval size equal to 10. Once again, it is presumed that the Web service container runs on port 9090 of the acme machine:

httpacme9090testWShtmlgetPRODUCTDATAmaxresults50intervalsize10

2.4 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES

When the Web service operations have been exported, there are some parameters that can be used to configure the connection pool used by the Web services to connect to the ITPilot server. The web.xml file, found in the WEB-INF path of the exported Web service (either inside the war file generated by ITPilot or in the directory where the Web service has been deployed), has three parameters used to configure the connection pool:

1. poolEnabled: this parameter is used to enable or disable the connection pool. The possible values are “true” or “false”.

<env-entry>
    <env-entry-name>poolEnabled</env-entry-name>
    <env-entry-value>false</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

2. poolInitSize: defines the initial size of the connection pool.

<env-entry>
    <env-entry-name>poolInitSize</env-entry-name>
    <env-entry-value>0</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this parameter value, new requests are queued until a connection becomes free.

<env-entry>
    <env-entry-name>poolMaxActive</env-entry-name>
    <env-entry-value>30</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

3 ITPILOT DEVELOPMENT API

Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Amongst other functions, this API facilitates connecting to a Denodo ITPilot execution server, obtaining a reference to a wrapper installed in said server, and querying it. It also allows a series of additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.

The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Amongst other tasks, said instance allows a list of the wrappers available in the server to be obtained, as well as a reference to a specific wrapper, represented through an instance of the class HTMLWrapperProxy. Said instance may be used to carry out various tasks on the wrapper, the most important of which is query execution.

When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).

The following subsections deal in more detail with each of the stages mentioned: connection to the server, obtaining references to wrappers, executing actions on them, and query processing. An exhaustive description of the API on a programming level can be found in the Javadoc documentation [JDOC].

3.1 CONNECTING TO THE SERVER

There are two ways of connecting to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.

If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port in which the server is executed.

If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, it is possible to specify any database created in the Virtual DataPort server in the connection to the server, and to use any user defined in it. The actions allowed for the user will be consistent with the permissions assigned to said user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).

In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In this constructor, in addition to the machine and port in which the server is executed, the name of the Virtual DataPort database to which the connection is to be made should be specified, as well as the user ID with which access is to be made and the associated password.

Even if Virtual DataPort is installed, it is equally possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called ‘itpilot’ is accessed. The predefined user ‘admin’ (with the initial password ‘admin’) is used to gain access.

3.2 OBTAINING WRAPPERS

As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class incorporates methods for obtaining data on the execution server and accessing the wrappers present in it:

• Collection getHTMLWrapperNames(): obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as execution server, the connection will have been made to a Virtual DataPort database, and only those wrappers associated with said database will be obtained.

• HTMLWrapperProxy getHTMLWrapper(String wpName): obtains a reference to the wrapper whose name is specified as parameter.

• Collection getDatabaseNames(): this method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.

• void deleteWrapper(String wpName): deletes from the server the wrapper whose name is specified as parameter.

• void loadWrapper(String vql): takes as input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.

• String getVQL(): returns the VQL description of all wrappers in the ITPilot execution server.

33 USING WRAPPERS

Once a reference to a wrapper has been obtained (instance of the class HTMLWrapperProxy) various operations can be carried out on it through the methods of said class To execute a query to a wrapper we will use the method

HTMLWrapperResultIterator query(Map params) The query to be executed is represented as a map of pairs name of attributevalue The attribute names must match the names of the input parameters specified during the creation of the wrapper The values must be specified as character strings even when the input parameters expected by the wrapper belong to other type For example if a wrapper is expecting a float-type parameter and we want to assign the value 325 when invoking it we must pass the ldquo325rdquo string In the case of float double and date data types it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or in case of date data types the date pattern if it was set It is important to take into account that for the query to execute correctly a value must be specified for all the mandatory attributes See [GENER] for more information on the process of generating wrappers in ITPilot Although most of the applications will not require this a wrapper schema can be obtained using the method

HTMLWrapperMetaRegisterRawVO getSchema() This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of said schema The schema was defined during the generation of the wrapper (see [GENER]) The results returned by a wrapper follow a hierarchical structure Each output tuple contains a value for every attribute contained in the wrapper response Each attribute may be either atomic or compound The value of atomic attributes can be of any of the basic data types available in ITPilot int long float double text date

ITPilot 46 Developer Guide

ITPilot Development API 9

Boolean or blob The value of a compound attribute is always an array of registers In the same form each register will be composed of several fields and again these fields may be either atomic or compound For example a wrapper that returns data on movies may have a schema in which each result is comprised of the fields TITLE DIRECTOR and EDITIONS TITLE and DIRECTOR are atomic fields and EDITIONS is a compound field containing data on various editions available of the movie (DVD VHS directorrsquos cut etc) The value of EDITIONS is an array of registers where each register contains the fields FORMAT PRICE and DESCRIPTION all of which are atomic The invocation to getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO which represents the schema of a ldquohierarchicalrdquo register of the type described above See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO It is also possible to access the characteristics of the various atomic fields that comprise the schema Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO Specifically the following information can be obtained from an atomic field its type by using the method javalangClass getType() whether the value is obtained from the source or not (that is to know if it is a searchable field that can not be found in the output schema using the method boolean isSearchStatus()) and in that case whether it is mandatory or not (method boolean isMandatoryStatus()) Furthermore if they have been defined during the generation process it is also possible to obtain the regular expression (method javalangString getRegexp()) and the aliases defined for each field (method javautilList getTextValues()) Finally the methods

void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting, through the API, whether a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the DENODO_HOME/metadata/maintenance-backup path (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.

3.4 PROCESSING QUERY RESULTS

The query method for executing queries against a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (it is important to remember that the wrapper obtains the data from the source in real time through the network).

The method hasNext() checks whether there are still elements to return. Because of the asynchronous behavior described above, hasNext() must be called before accessing each element to make sure that data is available. The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field. The method next() throws a NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX() (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation [JDOC] for details.

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:

• Boolean checkErrors(). Checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.

• String getErrorDescription(). If an error has occurred, returns a textual description of it. Otherwise, it returns null. The custom error messages specified by the wrapper creator for the raise error handler in the Wrapper Generator Tool (see [GENER]) are accessed through this method.
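
The pattern described in this section (test hasNext() before every next(), then check for errors once iteration ends) can be sketched as follows. The sketch does not use the ITPilot classes: ErrorAwareIterator is a hypothetical stand-in that mirrors only the four methods discussed above, and ResultCollector, collect and FakeResults are names introduced purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the result iterator: only the methods
// described in this section are mirrored here.
interface ErrorAwareIterator extends Iterator<Object> {
    boolean checkErrors();
    String getErrorDescription();
}

class ResultCollector {

    // Reads every result, then reports any error raised during the query.
    static List<Object> collect(ErrorAwareIterator results) {
        List<Object> rows = new ArrayList<>();
        while (results.hasNext()) {      // always test before calling next()
            rows.add(results.next());
        }
        if (results.checkErrors()) {
            throw new IllegalStateException(results.getErrorDescription());
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Object> rows = collect(new FakeResults(Arrays.asList("Manhattan", "Zelig"), null));
        System.out.println(rows.size() + " results read");
    }
}

// In-memory fake (an assumption of this sketch) used only to exercise collect().
class FakeResults implements ErrorAwareIterator {
    private final Iterator<?> rows;
    private final String error;   // null means "no error"

    FakeResults(List<?> rows, String error) {
        this.rows = rows.iterator();
        this.error = error;
    }

    @Override public boolean hasNext() { return rows.hasNext(); }
    @Override public Object next() { return rows.next(); }
    @Override public boolean checkErrors() { return error != null; }
    @Override public String getErrorDescription() { return error; }
}
```

Turning the error description into an exception is a design choice of this sketch; an application could just as well log it and keep the partial results.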

3.4.1 Canceling Queries

The following method from the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

void cancel()

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the 'acme' machine, on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the same one used as an example in the preceding section:

TITLE, DIRECTOR, EDITIONS (FORMAT, PRICE, DESCRIPTION)

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and shown in the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text, except the field PRICE, which is a double. Finally, any errors produced during execution are checked.

package com.denodo.itpilot.client;

import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;

public class ITPilotExample {

    public static void main(String args[]) {
        try {
            // Connect to server
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

            // Get wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare query params
            Map queryParams = new HashMap();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate results
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + ". ");

                // Get and print atomic fields: TITLE, DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);

                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();

                    // FORMAT is read from the map of field values
                    Map edition = editionVO.getValues();
                    SimpleVO formatVO = (SimpleVO) edition.get("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);

                    // PRICE is a double, so its typed accessor is used
                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);

                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\t DESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server: " + e.getMessage());
        } finally {
            // Clean-up code, if any, goes here
        }
    }
}

Figure 1 Example of query execution to a wrapper

4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Although custom functions can be created using naming conventions alone, without dependencies on Denodo libraries, developing them with Java annotations is recommended. The annotations and the compound types and values required to create custom functions are located in

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading/removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures: different combinations of arguments (different number, types or both). For each signature of the function, this class must define a Java method implementing the functionality of the function with those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be basic types (int, long, double, etc.).

4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or that return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Its definition is optional: if it is not present, the execute method will be called and the return type will be obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement. Class annotation used to define the class as a custom function. The annotation requires the parameters:
    • name: name of the custom function.
    • type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor. Method annotation used to specify the method as a function signature. This method will be executed when the function is used with the appropriate arguments. The annotation has an optional parameter, syntax, that specifies how the function signature is presented to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType. Method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.

• com.denodo.common.custom.annotations.CustomParam. Parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
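
As an illustration of the naming conventions described earlier in this section, the class below is a minimal sketch of a function that would be interpreted as Concat_Sample, with a single signature of arity n. It depends on no Denodo libraries. The treatment of null inputs is an assumption of this sketch, and the class is left package-private only so that the snippet compiles as a single file.

```java
// Sketch of a custom function defined purely by naming conventions:
// the class name ends in "ItpFunction", so the function name is Concat_Sample.
class Concat_SampleItpFunction {

    // The method is named "execute". Its last (here, only) parameter is an
    // array, which is how a parameter with arity n is declared.
    public String execute(String... inputs) {
        if (inputs == null) {
            return null;                 // assumption: null in, null out
        }
        StringBuilder joined = new StringBuilder();
        for (String input : inputs) {
            if (input != null) {         // assumption: null arguments are skipped
                joined.append(input);
            }
        }
        return joined.toString();
    }

    // Plain Java entry point, only to try the method outside ITPilot.
    public static void main(String[] args) {
        System.out.println(new Concat_SampleItpFunction().execute("IT", "Pilot"));
    }
}
```

Note that the parameter is declared as String rather than a primitive type, in line with the type mapping table in section 4.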

4.2 COMPOUND TYPES

Compound types and values in custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType. Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue. Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType. Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue. Class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil. Helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on input values, or functions returning compound types, can implement an additional method to compute the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as its respective parameter in the execute method, or an equivalent one.

If the execute method returns a basic Java type, the additional method has to return the same basic Java class. I.e. if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 to find out the type that these return values will have in ITPilot.
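
The pairing of an execute method with its return-type method, using the naming conventions of section 4.1, might look like the sketch below. Upper_SampleItpFunction is a hypothetical function invented for this illustration, and declaring the return type of executeReturnType as Class<?> is an assumption: the text above only states which value has to be returned. For a fixed basic return type such as this one the additional method is optional; it is included only to show the shape.

```java
// Hypothetical function Upper_Sample, defined by naming conventions only.
class Upper_SampleItpFunction {

    // Function signature: one text argument, text result.
    public String execute(String value) {
        return value == null ? null : value.toUpperCase();
    }

    // Same parameter list as execute. Because execute returns a String,
    // this method returns java.lang.String.class.
    public Class<?> executeReturnType(String value) {
        return String.class;
    }

    // Plain Java entry point, only to try the methods outside ITPilot.
    public static void main(String[] args) {
        Upper_SampleItpFunction f = new Upper_SampleItpFunction();
        System.out.println(f.execute("itpilot") + " : " + f.executeReturnType("itpilot").getName());
    }
}
```

The return-type method does no work on its argument, which is the point of the optimization described above: the type can be obtained without running the function itself.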

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    @CustomExecutor()
    public CustomArrayValue split_sample(
            @CustomParam(name="regexp") String regex,
            @CustomParam(name="value") String value) {

        if (value == null || regex == null) {
            return null;
        }
        String[] result = value.split(regex);
        LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
        List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
        // Build one single-field record per substring
        for (String string : result) {
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        // The result is an array of records with a single text field
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}

Figure 2 ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for programming, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with edition 3 of the ECMA-262 standard [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.

Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3:

function getInit() {
    var start = new Init();
    start.setText("INITPARAM", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
    // Component implementation
}

Figure 3 ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: this must be used to return the set of searchable parameters.
3. getOutputSchema() function: this function is used to return the structure of the output objects, if they exist (1).

The functions are linked to the definition of the process as components: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

(1) Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

This function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on, in section 5.3 (the Component Catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure

This function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the catalog in section 5.3.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters of the functions described here can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if a value has to be given for another argument to its right. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). rs.setText("FIELD", OBLIGATORY) is not valid; the following must be used: rs.setText("FIELD", "", OBLIGATORY).

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections define them.

5.3.2.1 Record Structure

• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
    o Constructor(name)
        • name: name of the structure.
    o setText(field, regexp, type): creates a new character string field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression associated with the character string. By default, if no constraint exists, its value is "".
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setLink(field, type): creates a new Link-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setInt(field, type): creates a new Integer-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setBoolean(field, type): creates a new boolean-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setLong(field, type): creates a new Long-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setFloat(field, type): creates a new Float-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setDouble(field, type): creates a new Double-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setDate(field, regexp, format, type): creates a new Date-type field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression associated with the character string. By default, if no constraint exists, its value is "".
        • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setRegister(record, type): creates a new Record-type field in the record.
        • record: record name.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setArray(name, structure, type): creates a new Array-type field in the record.
        • name: name of the array.
        • structure: data structure that represents the record structure contained in the array.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the case of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List
• Functions:
    o setListName(listName): sets the name of the list.
        • listName: name of the list.
    o add(obj): adds an element to the list.
        • obj: element to add.
    o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these "common" functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave in the event of an error. The onError function can be invoked several times with different errorId parameter values.
    o errorId: type of error whose handling is being configured. The possible values are:
        • RUNTIME_ERROR: error while the component is being run.
        • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
        • HTTP_ERROR: error produced by an HTTP error.
        • TIMEOUT_ERROR: error that occurs when the Web source takes too long to answer. The waiting time is configurable. Where the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the http browser.
        • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
    o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
        • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
        • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return "null" if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it will be evaluated as "false" if they are used in a condition expression.
        • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each parameter.
        • ON_ERROR_RETRY_IGNORE: reruns the wrapper, as with ON_ERROR_RETRY, but continues with the wrapper execution if the error still occurs after the retries.

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level to be used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:
    o TRACE
    o DEBUG
    o INFO
    o WARN
    o ERROR
    o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
    o Constructor()
    o exec(record, list): executes the function.
        • record: record to be added to the list.
        • list: list to which the record is added.

5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
    o Constructor(expr)
        • expr: this parameter defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
        • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements on which the condition is evaluated.

5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
    o Constructor(listname): creates an empty list.
        • listname: name of the list of records to be created.
    o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
    o Constructor(): creates a persistent browser and returns its handler.
    o exec(): executes the component.

5.3.8 Diff

• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
    o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
        • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
        • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
        • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
        • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
        • tokenSeparator: character string used as the HTML page element separator when the result page is generated, so that each element can be properly identified.
    o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
        • baseCode: character string with the source page content.
        • finalCode: character string or page object with the target page content.
    o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages with the differences between them marked.
        • baseCode: character string with the source page content.
        • finalCode: character string or page object with the target page content.
    o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
        • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
    o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
        • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
        • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

    o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
        • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
        • nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.
    o setIgnoreTagAttributes(simplifyTags): the component does not take the HTML tag attributes into account when comparing both pages.
        • simplifyTags: "true" means that the HTML tag attributes are ignored; with "false" they are not ignored.
    o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
        • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
    o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
        • mergedDeletions: with "true", the deleted content is shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
    o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
        • replacement: Perl [PERL] regular expression.
    o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page; tokens that match the regular expression are discarded before the comparison starts.
        • regexp: Perl [PERL] regular expression.
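
The role of the prefix, suffix and separator labels can be pictured with a small token-level comparison. This guide does not describe the algorithm the Diff component actually uses, so the sketch below swaps in a textbook longest-common-subsequence comparison; diffTokens and the labels object are hypothetical names, and the bracket labels stand in for the default coloured HTML tags.

```javascript
// Hypothetical sketch; not ITPilot's Diff implementation.
// Marks deleted and added tokens using caller-supplied labels.
function diffTokens(baseTokens, finalTokens, labels) {
  var n = baseTokens.length;
  var m = finalTokens.length;
  var i, j;
  // lcs[i][j] = length of the longest common subsequence of
  // baseTokens[i..] and finalTokens[j..].
  var lcs = [];
  for (i = 0; i <= n; i++) {
    lcs.push(new Array(m + 1).fill(0));
  }
  for (i = n - 1; i >= 0; i--) {
    for (j = m - 1; j >= 0; j--) {
      lcs[i][j] = baseTokens[i] === finalTokens[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  function deleted(t) { return labels.deletionPrefix + t + labels.deletionSuffix; }
  function added(t) { return labels.additionPrefix + t + labels.additionSuffix; }
  var out = [];
  i = 0;
  j = 0;
  while (i < n && j < m) {
    if (baseTokens[i] === finalTokens[j]) {
      out.push(baseTokens[i]); // unchanged token
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      out.push(deleted(baseTokens[i])); // token only in the base page
      i++;
    } else {
      out.push(added(finalTokens[j])); // token only in the final page
      j++;
    }
  }
  while (i < n) { out.push(deleted(baseTokens[i])); i++; }
  while (j < m) { out.push(added(finalTokens[j])); j++; }
  return out.join(labels.tokenSeparator);
}

var labels = {
  additionPrefix: "[+", additionSuffix: "+]",
  deletionPrefix: "[-", deletionSuffix: "-]",
  tokenSeparator: " "
};
var result = diffTokens(["<p>", "old", "</p>"], ["<p>", "new", "</p>"], labels);
console.log(result);
```

Passing HTML tags as the labels instead of brackets would produce a result page in which the changed tokens are highlighted.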

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
    o Constructor(expression)
        • expression: object that defines the condition expression, expressed as a string of characters. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
        • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor
• Description: this component is responsible for extracting structured data from an HTML page by means of a DEXTL program ([DEXTL]).
• Functions:
    o Constructor(name, page, specification, structure)
        • name: name of the Extractor component instance.
        • page: page-type ITPilot structure from which data is to be extracted.
        • specification: DEXTL data extraction specification (see [DEXTL]).
        • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
    o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
    o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).
        • merge: Boolean parameter; "true" if the pattern merge technique is to be applied, "false" if not. It is "true" by default.
    o setI18n(i18n): updates the internationalization configuration of the process.
        • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB, and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
    o Constructor(url, sequenceType, reusableConnection, binary, page)
        • url: URL where the resource to be downloaded can be found (OPTIONAL).
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
        • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
        • page: optionally, the page from which the HTTP request is launched.
    o exec(page): runs the component, returning the string- or binary-type value obtained.
        • page: optionally, the page from which the HTTP request is launched.
    o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
        • encoding: MIME type of the information to send.
    o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
        • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
    o setBackSequence(back): optionally sets an explicit navigation sequence back to the originating page, against which further extraction operations are going to be executed.
        • back: back sequence NSEQL program.
    o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

        • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.
    o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor POST synchronization has been defined.
    o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
        • 0: default browser implementation.
        • 1: Internet Explorer browser implementation.
        • 2: Firefox browser implementation.
        • 3: Denodo HTTP browser implementation.
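
The order in which the three state-recovery options described above take effect can be summarized in a few lines. chooseBackStrategy and its config object are hypothetical names introduced for this sketch, not part of the ITPilot API, and the fallback of one page when no page count is given is an assumption.

```javascript
// Hypothetical helper for illustration; not part of the ITPilot API.
// config.syncWithPost : boolean, as set through syncWithPost(flag)
// config.backSequence : NSEQL program or null, as set through setBackSequence(back)
// config.backPages    : number, as set through setBackPages(pages)
function chooseBackStrategy(config) {
  if (config.syncWithPost) {
    // Re-send the POST request that originally produced the page.
    return { strategy: "POST" };
  }
  if (config.backSequence) {
    // Run the explicit back sequence.
    return { strategy: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // Fall back to the NSEQL Back() command.
  // Defaulting to one page is an assumption of this sketch.
  return { strategy: "BACK_COMMAND", pages: config.backPages || 1 };
}

var viaPost = chooseBackStrategy({ syncWithPost: true });
var viaSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<back sequence NSEQL program>" });
var viaBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
console.log(viaPost.strategy, viaSequence.strategy, viaBack.strategy, viaBack.pages);
```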

5.3.13 Filter

• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
    o Constructor(expr, auxiliaryRecords)
        • expr: regular expression of the filtering operation for a list of records, which are described in the exec function.
        • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
    o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subgroup that complies with the selection expression indicated in the constructor.
        • inputRecords: list of input records.
        • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
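
The behaviour described in the note can be sketched as follows. filterSketch is a hypothetical stand-in, not ITPilot's Filter component: it takes a plain JavaScript predicate instead of an expression string, and a boolean flag plays the role of the ON_ERROR_IGNORE handler.

```javascript
// Hypothetical stand-in for illustration; not ITPilot's Filter.
// ignoreErrors = true approximates an ON_ERROR_IGNORE error handler.
function filterSketch(inputRecords, predicate, ignoreErrors) {
  var result = [];
  for (var i = 0; i < inputRecords.length; i++) {
    try {
      if (predicate(inputRecords[i])) {
        result.push(inputRecords[i]);
      }
    } catch (e) {
      if (!ignoreErrors) {
        throw e; // comparable to raising the error
      }
      // Otherwise the failing record is skipped and filtering continues.
    }
  }
  return result;
}

var records = [{ PRICE: "10" }, { PRICE: null }, { PRICE: "30" }];
var atLeastTen = function (r) {
  // Calling trim() on null throws, simulating a record that causes an error.
  return Number(r.PRICE.trim()) >= 10;
};
var kept = filterSketch(records, atLeastTen, true);
console.log(kept.length); // the record with the null PRICE is left out
```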

5.3.14 Form Iterator

• Object: Form_Iterator
• Description: generates an execution loop over a specific form, in which predetermined values for each of the included fields are used in each run.
• Functions:
    o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
        • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
        • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
        • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences used in this component.
        • inputPage: input page from which the selected form can be iteratively invoked.
        • parallelIterator: if "true", the component executes its iterations in parallel.
    o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.
        • field: name of the multiple selection field.
        • position: position of the field among those with the same name, starting at position 0.
        • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
        • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
            • CLICKED_ELEMENT: mark the element.
            • NON_CLICKED_ELEMENT: leave the element unmarked.

            • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
    o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field for the chosen form.
        • field: name of the multiple selection field.
        • position: position of the field among those with the same name, starting at position 0.
        • valuesArray: list of values that must be selected in the field.
        • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
        • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
        • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
            • CLICKED_ELEMENT: mark the element.
            • NON_CLICKED_ELEMENT: leave the element unmarked.
            • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
    o selectPositions(field, position, positions): indicates the values selected from a selection field for the chosen form.
        • field: name of the HTML selection field.
        • position: position occupied in the event of more than one field element with the same name.
        • positions: values of the elements on which the component must iterate.
    o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.
        • field: name of the HTML text field.
        • position: position of the field in the event of several fields on the form with the same name.
        • values: list of values that must be selected in the field.
        • positions: list that indicates the position held for each values element in the event of replicated values.
        • equals: Boolean value that indicates whether the field values must exactly match those provided to the function ("true") or may merely contain them ("false").

    o click(field, value, state): selects an element and runs a "click" event on it.
        • field: name of the HTML field on which the click is to be made.
        • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
        • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
            • CLICKED_ELEMENT: mark the element.
            • NON_CLICKED_ELEMENT: leave the element unmarked.
            • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
    o input(field, position, values): indicates the values added to an input field.
        • field: name of the HTML input field.
        • position: position of the field in the event of several fields on the form with the same name.
        • values: list of values that must be entered in the field.
    o textarea(field, position, values): indicates the values added to a text area.
        • field: name of the HTML text area field.
        • position: position of the field in the event of several fields on the form with the same name.
        • values: list of values that must be entered in the field.
    o toList(): returns the list with the NSEQL sequences used in each iteration.
    o setMaxIterations(count): sets the maximum number of iterations that can be executed.
        • count: maximum number of iterations.
    o setRetries(count): sets the number of retries in the event of failures.
        • count: number of retries.
    o setRetryDelay(mseconds): sets the waiting time between retries.
        • mseconds: waiting time between retries, in milliseconds.
    o setParallelIterator(flag): makes the component launch its iterations in parallel.
        • flag: if "true", the iterations are executed in parallel.
    o next(inputPage): returns the page resulting from running a component iteration.
        • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.

    o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
    o close(): closes the iterator.
    o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
        • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
    o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
        • back: NSEQL back program.
    o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
        • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
    o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor POST synchronization has been configured.
    o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
        • 0: default browser implementation.
        • 1: Internet Explorer browser implementation.
        • 2: Firefox browser implementation.
        • 3: Denodo HTTP browser implementation.
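
One way to picture how the configured field values turn into iterations is as a set of combinations, capped by the limit given to setMaxIterations. The sketch below assumes that every value of every field is combined with every value of the others; this guide does not confirm that the component enumerates combinations this way. The functions statesFor and combinations, the field objects and the numeric values given to the constants are all hypothetical.

```javascript
// Hypothetical sketch; not ITPilot's Form_Iterator implementation.
// The constant names come from the text; their values here are made up.
var CLICKED_ELEMENT = 1;
var NON_CLICKED_ELEMENT = 2;
var CLICKED_AND_NON_CLICKED_ELEMENT = 3;

// Expands a click state into the checkbox states it stands for.
function statesFor(clicked) {
  if (clicked === CLICKED_AND_NON_CLICKED_ELEMENT) {
    return [true, false]; // two combinations: marked and unmarked
  }
  return [clicked === CLICKED_ELEMENT];
}

// fields: array of { name: string, values: array }.
// Returns one object per combination, at most maxIterations of them.
function combinations(fields, maxIterations) {
  var combos = [{}];
  fields.forEach(function (field) {
    var next = [];
    combos.forEach(function (combo) {
      field.values.forEach(function (value) {
        var copy = Object.assign({}, combo);
        copy[field.name] = value;
        next.push(copy);
      });
    });
    combos = next;
  });
  return combos.slice(0, maxIterations);
}

var fields = [
  { name: "city", values: ["Madrid", "Paris"] },
  { name: "newsletter", values: statesFor(CLICKED_AND_NON_CLICKED_ELEMENT) }
];
var all = combinations(fields);
var capped = combinations(fields, 3);
console.log(all.length, capped.length);
```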

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
    o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
        • browserUuid: browser id.
    o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27).
        • pageType: type of browser used to access the page:
            • SEQUENCE_IEBROWSER = 1
            • SEQUENCE_HTTP_BROWSER = 2
        • lastURL: last URL the page is coming from.
        • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
        • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
        • cookie: information storage "cookies".
        • proxyUser: user name to access the proxy, if required.
        • proxyPassword: user password to access the proxy, if required.
        • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: responsible for storing the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
    o Constructor(input, output)
        • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context.
        • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
    o get(name): returns the value of a record field created as one of the initialization parameters.
        • name: name of the record field.
    o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.

    o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
        • field: name of the field to create.

        • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
        • field: name of the field to create.
        • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
        • field: name of the field to create.
        • format: representation format of the date field. This format is optional but becomes compulsory if completed; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
        • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:

            • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
            • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
            • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
        • fixedValue: optional parameter that indicates a constant value assigned to the field.
    o setName(name): updates the component name.
        • name: new component name.
    o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
        • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
    o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
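
The effect of the three obl values can be illustrated with a small stand-in. InitSketch is hypothetical and is not ITPilot's Init component: its exec receives the query parameters explicitly, whereas the real component takes them from the JavaScript context. Returning null for an optional field without a value, and raising an error for a missing obligatory field, are assumptions of the sketch.

```javascript
// Hypothetical stand-in for illustration; not ITPilot's Init component.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

function InitSketch() {
  this.fields = [];
}

// Mirrors the setText(field, obl, fixedValue) signature described above.
InitSketch.prototype.setText = function (field, obl, fixedValue) {
  this.fields.push({ name: field, obl: obl || OPTIONAL, fixedValue: fixedValue });
};

// Unlike the real exec(), this sketch receives the query parameters explicitly.
InitSketch.prototype.exec = function (queryParams) {
  var record = {};
  this.fields.forEach(function (f) {
    var supplied = Object.prototype.hasOwnProperty.call(queryParams, f.name);
    if (f.obl === FIXED) {
      record[f.name] = f.fixedValue; // constant, regardless of the query
    } else if (supplied) {
      record[f.name] = queryParams[f.name];
    } else if (f.obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + f.name);
    } else {
      record[f.name] = null; // assumption: unset optional fields are null
    }
  });
  return record;
};

var init = new InitSketch();
init.setText("QUERY", OBLIGATORY);
init.setText("LANG", FIXED, "en");
init.setText("PAGE"); // OPTIONAL by default
var inputRecord = init.exec({ QUERY: "denodo" });
console.log(JSON.stringify(inputRecord));
```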

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records one by one.
• Functions:
    o Constructor(list)
        • list: list of records on which to iterate.
    o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.
    o next(): returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
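
For comparison with the parallel form, a plain sequential traversal follows the usual hasNext()/next() pattern. IteratorSketch below is a hypothetical stand-in over a JavaScript array, not ITPilot's Iterator component, and the TITLE field is made up for the example.

```javascript
// Hypothetical stand-in for illustration; not ITPilot's Iterator.
function IteratorSketch(list) {
  this.list = list;
  this.position = 0;
}

// True while at least one more record is left.
IteratorSketch.prototype.hasNext = function () {
  return this.position < this.list.length;
};

// Returns the next record, in list order.
IteratorSketch.prototype.next = function () {
  return this.list[this.position++];
};

var iterator = new IteratorSketch([{ TITLE: "first" }, { TITLE: "second" }]);
var titles = [];
while (iterator.hasNext()) {
  var recordInstance = iterator.next();
  titles.push(recordInstance.TITLE); // process one record per iteration
}
console.log(titles.join(", "));
```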

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
    o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
        • uuid: unique identifier of the component.
        • uri: connection URL to the database.
        • driver: driver class to use to connect to the data source.
        • userName: user name.
        • password: user password.
        • structure: structure of the component's output record list. It is defined as a record of values.
        • baseRecords: record list to be used.
        • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
        • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
        • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
        • query: SQL query that returns the results required by the component.
    o exec(query, baseRecords): executes the JDBCExtractor component.
        • query: SQL query that returns the results required by the component.
        • baseRecords: record list to be used.
    o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
        • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
        • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
        • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 47

o disablePool() disables the connection pool

o addDriverProperty(propname propvalue) adds a JDBC driver property

bull propname property name

bull propvalue property value
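The call order of the JDBCExtractor functions can be sketched as follows. This is only an illustration: the JDBCExtractor defined here is a stand-in that records its arguments so the snippet runs outside ITPilot, and the connection URL, driver class, credentials, table and the shape of the structure argument are all hypothetical example values, not taken from the guide.

```javascript
// Stand-in for the ITPilot JDBCExtractor object (assumption: it only records
// its arguments; the real component opens JDBC connections).
function JDBCExtractor(uuid, uri, driver, userName, password, structure,
                       baseRecords, maxPoolSize, initialPoolSize, checkQuery, query) {
    this.config = {
        uuid: uuid, uri: uri, driver: driver, userName: userName,
        structure: structure, baseRecords: baseRecords, query: query
    };
    this.pool = { enabled: true, max: maxPoolSize, initial: initialPoolSize, check: checkQuery };
    this.driverProperties = {};
}
JDBCExtractor.prototype.setPoolConfig = function (maxPoolSize, initialPoolSize, pingQuery) {
    this.pool.max = maxPoolSize;
    this.pool.initial = initialPoolSize;
    this.pool.check = pingQuery;
};
JDBCExtractor.prototype.disablePool = function () {
    this.pool.enabled = false;
};
JDBCExtractor.prototype.addDriverProperty = function (propname, propvalue) {
    this.driverProperties[propname] = propvalue;
};
JDBCExtractor.prototype.exec = function (query, baseRecords) {
    // Stand-in behaviour: report what would be sent instead of querying a database.
    return { sentQuery: query, poolEnabled: this.pool.enabled };
};

// Hypothetical example values throughout.
var extractor = new JDBCExtractor(
    "jdbc-extractor-1",                        // uuid
    "jdbc:postgresql://localhost:5432/sales",  // uri
    "org.postgresql.Driver",                   // driver
    "report_user", "secret",                   // userName, password
    { ID: null, NAME: null },                  // structure (placeholder shape)
    [],                                        // baseRecords
    10, 2,                                     // maxPoolSize, initialPoolSize
    "SELECT 1",                                // checkQuery
    "SELECT ID, NAME FROM CUSTOMER");          // query

// Adjust the pool and add a driver property before executing.
extractor.setPoolConfig(20, 4, "SELECT 1");
extractor.addDriverProperty("ssl", "true");
var jdbcResult = extractor.exec("SELECT ID, NAME FROM CUSTOMER", []);
```

The eleven constructor arguments are positional, so keeping one argument per line with a comment makes the order easier to check against the list above.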

5.3.19 Loop

• Description: this allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}

Figure 6. Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: this iterates over a series of inter-related pages, using either a single browsing sequence or several different ones.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be made.

  o next(inputRecords, inputPage): returns the next iteration element.

    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.

  o close(): closes the iterator.

  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.

    • count: number of retries.

  o setRetryDelay(count): configures the interval between two retries.

    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out there.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data is imported from the previous session.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.

5.3.21 Output

• Object: Output

• Description: this places a record in the wrapper output.

• Functions:

  o Constructor(structure)

    • structure: parameter that indicates the component input record to be used as the wrapper result.

  o add(record): adds the component input record that is to be used as the wrapper result.

    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: this allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.

• Functions:

  o Constructor(recordsObj, name)

    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): adds a new field to the record under construction.

    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.
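The way add() and exec() fit together can be sketched with a small stand-in. Everything below is an assumption made for illustration: the Record_Constructor defined here is not the ITPilot implementation, it only resolves expressions of the dotted form $N.FIELD (the punctuation of the expression in the text is itself assumed), and skipping a failing field is just one reading of ON_ERROR_IGNORE. The record contents are invented.

```javascript
// Stand-in constants and object so the sketch runs outside ITPilot.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function Record_Constructor(recordsObj, name) {
    this.recordsObj = recordsObj;
    this.name = name;
    this.fields = [];
}
Record_Constructor.prototype.add = function (fieldName, expression, errorAction) {
    this.fields.push({ fieldName: fieldName, expression: expression, errorAction: errorAction });
};
Record_Constructor.prototype.exec = function () {
    var output = {};
    for (var i = 0; i < this.fields.length; i++) {
        var field = this.fields[i];
        try {
            output[field.fieldName] = resolveExpression(field.expression, this.recordsObj);
        } catch (e) {
            if (field.errorAction === ON_ERROR_RAISE) {
                throw e;
            }
            // ON_ERROR_IGNORE (assumed behaviour): leave the field out and continue.
        }
    }
    return output;
};

// Resolves only "$<index>.<FIELD>": field FIELD of the index-th input record.
function resolveExpression(expression, records) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = records[Number(match[1])];
    if (!record || !(match[2] in record)) {
        throw new Error("Cannot evaluate: " + expression);
    }
    return record[match[2]];
}

// Build one record from two input records (invented data).
var builder = new Record_Constructor([{ PARAM1: "Madrid" }, { PRICE: 12 }], "CITY_PRICE");
builder.add("CITY", "$0.PARAM1", ON_ERROR_RAISE);
builder.add("PRICE", "$1.PRICE", ON_ERROR_RAISE);
builder.add("MISSING", "$1.NOPE", ON_ERROR_IGNORE);
var builtRecord = builder.exec();
```

Here $0 and $1 index into the recordsObj list in the order it was passed to the constructor, which is the only part of the expression syntax the text describes.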

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: this creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; this allows a starting page to be indicated.

  o exec(): returns a page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.

• Functions:

  o Constructor(page)

    • page: page loaded in the browser that is going to be released.

  o Constructor(browserUuid)

    • browserUuid: browser identifier.

  o exec(): executes the component.

5.3.25 Repeat

• Description: this allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));

Figure 7. Using the Repeat function
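The practical difference between the Loop component (a while loop) and the Repeat component (a do… while loop) is when the condition is first checked. The sketch below shows this with a stand-in Condition whose exec() evaluates a plain JavaScript function; that constructor argument is an assumption made so the snippet runs outside ITPilot, since the real Condition takes a condition expression as described in section 5.3.5.

```javascript
// Stand-in for the ITPilot Condition object (assumption: wraps a JS predicate).
function Condition(predicate) {
    this.predicate = predicate;
}
Condition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// A condition that is false from the start.
var neverTrue = new Condition(function () { return false; });

// Loop style: the condition is checked before the body, so the body may never run.
var loopRuns = 0;
while (neverTrue.exec([])) {
    loopRuns++;
}

// Repeat style: the body runs once before the condition is checked.
var repeatRuns = 0;
do {
    repeatRuns++;
} while (neverTrue.exec([]));
```

With a condition that is false from the start, the while body is skipped entirely and the do… while body still runs once, which is the usual reason to choose one component over the other.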

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it holds within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: this creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)

    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.

  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.

    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o close(): closes the connection with the running browser.

  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out there.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data is imported from the previous session.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

    • pages: number of back pages.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
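The interplay of syncWithPost, setBackSequence and setBackPages amounts to a three-way decision about how ITPilot returns to the source page. The function below is a plain JavaScript restatement of that rule as worded in this section; the function name, the settings object and the returned labels are hypothetical and exist only for this sketch, not in the ITPilot API.

```javascript
// Restates the rule from the text:
//   1. syncWithPost is the default: re-issue the POST to the page URL.
//   2. If syncWithPost is false and a back sequence was set, run that sequence.
//   3. Otherwise run the NSEQL Back() command for the configured number of pages.
function chooseBackStrategy(settings) {
    if (settings.syncWithPost !== false) {
        return { strategy: "POST" };
    }
    if (settings.backSequence) {
        return { strategy: "BACK_SEQUENCE", sequence: settings.backSequence };
    }
    return { strategy: "NSEQL_BACK", pages: settings.backPages };
}

// The back-sequence text here is a placeholder, not real NSEQL.
var byDefault = chooseBackStrategy({});
var withSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

Reading the three setters as one decision makes it clear that setBackPages only matters once both of the other options have been ruled out.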

5.3.28 Store File

• Object: StoreFile

• Description: this stores the contents passed as the input parameter in a file.

• Functions:

  o Constructor(content, file)

    • content: string-type or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.

    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread

• Description: this represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.

  o execute(functionName, <list of arguments>): launches the execution thread on the given function.

    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.

    • int: maximum number.
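The usual calling pattern for these three functions is: cap the concurrency, call execute once per record, then wait. The sketch below shows only that pattern. The Thread defined here is a stand-in that runs each function immediately and in order, and it takes a function reference rather than a function name; both are assumptions made so the snippet runs outside ITPilot. The records are invented.

```javascript
// Stand-in for the ITPilot Thread object (assumption: no real concurrency).
function Thread() {
    this.maxConcurrentThreads = null;
    this.pending = 0;
}
Thread.prototype.setMaxConcurrentThreads = function (max) {
    this.maxConcurrentThreads = max;
};
Thread.prototype.execute = function (fn) {
    // Everything after the first argument is passed on to the function.
    var args = Array.prototype.slice.call(arguments, 1);
    this.pending++;
    fn.apply(null, args);
    this.pending--;
};
Thread.prototype.wait = function () {
    // In this stand-in all work has already finished; report that nothing is pending.
    return this.pending === 0;
};

// Per-record processing function; its parameters match the arguments given to execute().
var processedTitles = [];
function processRecord(prefix, record) {
    processedTitles.push(prefix + record.TITLE);
}

var extractedRecords = [{ TITLE: "A" }, { TITLE: "B" }, { TITLE: "C" }];
var worker = new Thread();
worker.setMaxConcurrentThreads(2);
for (var i = 0; i < extractedRecords.length; i++) {
    worker.execute(processRecord, "movie:", extractedRecords[i]);
}
var allDone = worker.wait();
```

Calling wait() after the loop is what guarantees that every record has been processed before the wrapper moves on to the next step.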

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically using the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) {
      var mycustom_output = null;
      …
      return mycustom_output;
  }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:

    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
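A minimal custom component file following the naming scheme above might look like this. It is a sketch under stated assumptions: the component name “shout” is hypothetical, the input is assumed to arrive as a plain string, STRING_TYPE is declared locally with the value listed above so the file runs on its own, and the input schema is left empty because the guide does not show the schema notation.

```javascript
// Output type constant, using the value listed in this section (13).
var STRING_TYPE = 13;

// Main function: <name>_main(<name>_input). Returns the input in upper case.
function shout_main(shout_input) {
    var shout_output = null;
    shout_output = String(shout_input).toUpperCase();
    return shout_output;
}

// Input schema. The schema notation is not shown in the guide, so nothing is defined here.
function shout_getInputStructure() {
    return null;
}

// Output type: a string, so no shout_getOutputStructure() is needed
// (it is only required for RECORD_TYPE and LIST_TYPE).
function shout_getOutputType() {
    return STRING_TYPE;
}

// Direct call, as a quick check outside ITPilot.
var shoutResult = shout_main("hello");
```

Because the file name must match the component name, a component like this one would be saved as shout.js in the itp-custom-components directory.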

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}

Figure 8. Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
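The try… finally wrapper in Figure 8 matters because it closes the scope even when the component fails. The sketch below demonstrates this with stand-ins: SCOPE and CUSTOM_COMPONENT here are minimal local objects written for this example, not the ITPilot ones, and the type and component names are hypothetical.

```javascript
// Stand-ins so the sketch runs outside ITPilot; they only log what happens.
var scopeLog = [];
var SCOPE = {
    create: function () { scopeLog.push("create"); },
    close: function () { scopeLog.push("close"); }
};
function CUSTOM_COMPONENT(type) {
    this.type = type;
    this.name = null;
}
CUSTOM_COMPONENT.prototype.setComponentName = function (name) {
    this.name = name;
};
CUSTOM_COMPONENT.prototype.exec = function (input) {
    if (input === null) {
        throw new Error(this.name + ": no input");
    }
    return this.name + " processed " + input;
};

// Same shape as Figure 8, wrapped in a function so it can be called twice.
function runCustom(input) {
    try {
        SCOPE.create();
        var mycustom = new CUSTOM_COMPONENT("example_type");
        mycustom.setComponentName("mycustom");
        return mycustom.exec(input);
    } finally {
        SCOPE.close();   // runs on success and on error
    }
}

var okOutput = runCustom("record-1");
var failed = false;
try {
    runCustom(null);     // the stand-in throws for null input
} catch (e) {
    failed = true;
}
```

After both calls the log holds a create/close pair for each run, including the one that threw, which is the guarantee the finally block provides.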

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the VQL statement just has to be written as follows:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the recently generated JavaScript code.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

3. poolMaxActive: defines the maximum number of active connections in the pool. When the number of connections exceeds this parameter value, new requests are queued until a free connection is available.

<env-entry>
    <env-entry-name>poolMaxActive</env-entry-name>
    <env-entry-value>30</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
</env-entry>

3 ITPILOT DEVELOPMENT API

Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Among other functions, this API facilitates connecting to a Denodo ITPilot execution server, obtaining a reference to a wrapper installed in that server, and querying it. It also supports a series of additional tasks, such as obtaining the list of wrappers installed in the server or activating automatic maintenance of a specific wrapper.

The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance makes it possible to obtain a list of the wrappers available in the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance can be used to carry out various tasks on the wrapper, the most important of which is query execution.

When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e. the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).

The following subsections deal in more detail with each of the stages mentioned: connecting to the server, obtaining references to wrappers, executing actions on them, and processing query results. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].

3.1 CONNECTING TO THE SERVER

There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.

If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server runs.

If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case it is possible to specify, in the connection to the server, any database created in the Virtual DataPort server, and to use any user defined in it. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users of Denodo Virtual DataPort).

In this case the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server runs, this constructor takes the name of the Virtual DataPort database to connect to, the user ID with which access is to be made, and the associated password. Note that even if Virtual DataPort is installed, it is still possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case a default database called ‘itpilot’ is accessed, and the predefined user ‘admin’ (with the initial password ‘admin’) is used to gain access.

3.2 OBTAINING WRAPPERS

As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class incorporates methods for obtaining data on the execution server and accessing the wrappers present in it:

• Collection getHTMLWrapperNames(): obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only the wrappers associated with that database are obtained.

• HTMLWrapperProxy getHTMLWrapper(String wpName): obtains a reference to the wrapper with the name specified as parameter.

• Collection getDatabaseNames(): this method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.

• void deleteWrapper(String wpName): deletes from the server the wrapper whose name is specified as parameter.

• void loadWrapper(String vql): takes as input argument the VQL that defines a collection of wrappers, which are loaded into the execution server.

• String getVQL(): returns the VQL description of all wrappers in the ITPilot execution server.

3.3 USING WRAPPERS

Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:

HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified during the creation of the wrapper. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper expects a float-type parameter and the value 325 is to be assigned when invoking it, the string “325” must be passed. In the case of the float, double and date data types, it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or, for date data types, the date pattern if one was set. For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.

Although most applications will not require this, a wrapper schema can be obtained using the method:

HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).

The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, Boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and again these fields may be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result is comprised of the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director’s cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a “hierarchical” register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the various atomic fields that comprise the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained from an atomic field: its type (method java.lang.Class getType()); whether the value is obtained from the source or not, that is, whether it is a searchable field that cannot be found in the output schema (method boolean isSearchStatus()); and, in that case, whether it is mandatory or not (method boolean isMandatoryStatus()). Furthermore, if they were defined during the generation process, it is also possible to obtain the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).

Finally, the methods

void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting, via the API, whether or not a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be automatically installed in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the DENODO_HOME/metadata/maintenance-backup path (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper is regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.
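The hierarchical result structure described in this section (the movie example with TITLE, DIRECTOR and EDITIONS) can be pictured as a nested value. The sketch below is written in JavaScript only to keep a single notation with the wrapper examples in section 5; it does not use the Java API, the movie data is invented, and modelling a compound field as an array of plain objects is an assumption made for illustration.

```javascript
// One result tuple of the hypothetical movie wrapper (invented data).
var movieResult = {
    TITLE: "Example Movie",          // atomic
    DIRECTOR: "Jane Doe",            // atomic
    EDITIONS: [                      // compound: an array of registers
        { FORMAT: "DVD", PRICE: 14.5, DESCRIPTION: "Standard edition" },
        { FORMAT: "VHS", PRICE: 6.0, DESCRIPTION: "Director's cut" }
    ]
};

// In this model a field is compound exactly when its value is an array of registers.
function describeField(value) {
    return Array.isArray(value) ? "compound" : "atomic";
}

// Classify every top-level field of the tuple.
var fieldKinds = {};
Object.keys(movieResult).forEach(function (name) {
    fieldKinds[name] = describeField(movieResult[name]);
});

// Each register inside a compound field has its own fields, here all atomic.
var editionFormats = movieResult.EDITIONS.map(function (edition) {
    return edition.FORMAT;
});
```

The same atomic-or-compound test applies again to the fields of each register, which is what makes the structure hierarchical rather than a flat row.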

3.4 PROCESSING QUERY RESULTS

The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (it is important to remember that the wrapper obtains the data from the source in real time through the network).

The method hasNext() checks whether there are still elements to return. Because access is asynchronous, this method must be called before accessing each element, to make sure that data elements are available.

The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field. The method next() throws an exception of type NoSuchElementException if no data are available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX() (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:

• Boolean checkErrors(): Checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.

• String getErrorDescription(): Where an error has occurred, this returns a textual description of it. Otherwise it returns null. The custom error messages specified by the wrapper creator for the 'raise error' handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.
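The hasNext()/next() contract described above can be illustrated without the ITPilot libraries, because HTMLWrapperResultIterator implements java.util.Iterator. The following is only a sketch: the class ResultDrain and its methods are hypothetical helpers, not part of the ITPilot API, and a plain list iterator stands in for real wrapper results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical helper, not part of the ITPilot API.
class ResultDrain {

    // Collects every element, always calling hasNext() before next().
    static List<Object> drain(Iterator<?> results) {
        List<Object> rows = new ArrayList<Object>();
        while (results.hasNext()) {
            rows.add(results.next());
        }
        return rows;
    }

    // Calls next() without the hasNext() guard and reports whether
    // NoSuchElementException was thrown.
    static boolean nextFailsWithoutData(Iterator<?> results) {
        try {
            results.next();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Stand-in for the iterator returned by a wrapper query.
        Iterator<String> standIn = Arrays.asList("row1", "row2").iterator();
        System.out.println(drain(standIn));
        System.out.println(nextFailsWithoutData(standIn));
    }
}
```

With a real wrapper, the same loop would be followed by a call to checkErrors() once iteration has finished.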

3.4.1 Canceling Queries

The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

void cancel()

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine 'acme', on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in the preceding section:

TITLE, DIRECTOR, EDITIONS {FORMAT, PRICE, DESCRIPTION}

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then the compound field EDITIONS is obtained; it is represented by an ArrayVO object that contains one RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except PRICE, which is a double. Finally, any errors produced during execution are checked.

package com.denodo.itpilot.client;

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
// DoubleVO is assumed to live in the same package as SimpleVO
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;

public class ITPilotExample {

    public static void main(String[] args) {
        try {
            // Connect to server
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

            // Get wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare query params
            Map queryParams = new HashMap();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate results; hasNext() must guard every call to next()
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + ". ");

                // Get and print atomic fields TITLE, DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);

                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();

                    SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);

                    // PRICE is the only non-text field (double)
                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);

                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\t DESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server");
        }
    }
}
Figure 1 Example of query execution to a wrapper

4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created using naming conventions alone, without dependencies on Denodo libraries, but we recommend developing them using Java annotations. The annotations, and the compound types and values required to create custom functions, are located in

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading/removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature, the class must define a Java method implementing the function for those arguments, and one additional method in case the signature returns a different type depending on the parameters or the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be primitive types (int, long, double, etc.).
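The equivalency table can also be read as a simple lookup from Java class to ITPilot type name. The sketch below is purely illustrative: ItpTypeTable and itpilotTypeOf are hypothetical names and do not exist in the Denodo libraries. It rejects primitive types such as int, in line with the note above.

```java
import java.util.Calendar;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup table mirroring the equivalency table; not a Denodo class.
class ItpTypeTable {

    private static final Map<Class<?>, String> TYPES =
            new LinkedHashMap<Class<?>, String>();

    static {
        TYPES.put(Integer.class, "int");
        TYPES.put(Long.class, "long");
        TYPES.put(Float.class, "float");
        TYPES.put(Double.class, "double");
        TYPES.put(Boolean.class, "boolean");
        TYPES.put(String.class, "text");
        TYPES.put(Calendar.class, "date");
        TYPES.put(byte[].class, "binary");
    }

    // Returns the ITPilot type name for a declared Java parameter type.
    // Primitive types (int.class, long.class, ...) are not in the table
    // and are therefore rejected.
    static String itpilotTypeOf(Class<?> javaType) {
        String name = TYPES.get(javaType);
        if (name == null) {
            throw new IllegalArgumentException(
                    "No ITPilot equivalent for " + javaType.getName());
        }
        return name;
    }

    public static void main(String[] args) {
        System.out.println(itpilotTypeOf(String.class));
        System.out.println(itpilotTypeOf(byte[].class));
    }
}
```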

4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String ... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. The definition of this method is optional. If it is not present, the execute method will be called and the return type will be obtained from the results of the execution. The advantage of defining executeReturnType is that in some cases calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:

  • name: name of the custom function.

  • type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to specify the method as a function signature. This method will be executed when the function is used with the appropriate arguments. The annotation has an optional variable, syntax, which specifies the syntax of the function signature as presented to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.

• com.denodo.common.custom.annotations.CustomParam: Parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
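As a minimal sketch of the naming-convention approach described at the start of this section, the class below needs no Denodo libraries at all. The class name and the execute(String ...) declaration follow the text; the method body and its treatment of null inputs are assumptions made for illustration only. The class is declared package-private so that the sketch compiles in any file; a deployable class would presumably be public.

```java
// Naming convention: <FunctionName> + "ItpFunction", so this class would be
// interpreted as a function named Concat_Sample.
class Concat_SampleItpFunction {

    // The last (here, only) parameter is an array, which gives the
    // function arity n. Null handling is an illustrative choice: a null
    // argument list yields null, and null elements are skipped.
    public String execute(String... inputs) {
        if (inputs == null) {
            return null;
        }
        StringBuilder joined = new StringBuilder();
        for (String input : inputs) {
            if (input != null) {
                joined.append(input);
            }
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        Concat_SampleItpFunction f = new Concat_SampleItpFunction();
        System.out.println(f.execute("IT", "Pilot"));
    }
}
```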

4.2 COMPOUND TYPES

Compound types and values in custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types, or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on input values, or functions returning compound types, can implement an additional method to compute the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execution method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same or an equivalent type as its respective parameter in the execute method.

The return type of the additional method is determined by the return type of the execute method:

• If the execute method returns a basic Java type, the additional method has to return the corresponding Java class. I.e. if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 for the type that these return values will have in ITPilot.
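The simplest case of these rules, an execute method returning a basic Java type, can be sketched with the naming conventions of section 4.1 and no Denodo dependencies. The class Upper_SampleItpFunction is a hypothetical example, not a function shipped with ITPilot, and the sketch assumes nothing about the argument values ITPilot passes when it computes the return type, so executeReturnType ignores its argument.

```java
import java.util.Locale;

// Hypothetical function following the <FunctionName> + "ItpFunction" pattern.
class Upper_SampleItpFunction {

    // Function signature: one text parameter, text result.
    public String execute(String value) {
        return value == null ? null : value.toUpperCase(Locale.ROOT);
    }

    // Same number and type of parameters as execute. Because execute
    // returns a String, this method returns java.lang.String.class.
    public Class<?> executeReturnType(String value) {
        return String.class;
    }

    public static void main(String[] args) {
        Upper_SampleItpFunction f = new Upper_SampleItpFunction();
        System.out.println(f.execute("itpilot"));
        System.out.println(f.executeReturnType("itpilot").getName());
    }
}
```

For a function like this one the additional method is optional; it becomes mandatory only in the cases listed in rule 1.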

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT_SAMPLE, which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type = CustomElementType.ITPFUNCTION, name = "SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    @CustomExecutor()
    public CustomArrayValue split_sample(
            @CustomParam(name = "regexp") String regex,
            @CustomParam(name = "value") String value) {

        if (value == null || regex == null) {
            return null;
        }

        String[] result = value.split(regex);
        List<CustomRecordValue> arrayValues =
                new ArrayList<CustomRecordValue>(result.length);

        for (String string : result) {
            // One map per record, so that records never share state
            LinkedHashMap<String, Object> fields =
                    new LinkedHashMap<String, Object>(1);
            fields.put(STRING_FIELD, string);
            CustomRecordValue recordValue =
                    CustomElementsUtil.createCustomRecordValue(fields);
            arrayValues.add(recordValue);
        }

        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    // Computes the return type (array of records with one text field)
    // without executing the function
    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}

Figure 2 ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262].

The following sections assume some basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components (section 5.3) and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

function getInit() {
    var start = new Init();
    start.setText("INITPARAM", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
    // component implementation
}

Figure 3 ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist [1].

These functions correspond to the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined exactly as it is received by the Output component.

[1] Since version 4.0SP1 this function, previously known as getMetadata, has been renamed to getOutputSchema. There is backwards compatibility, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

The getInit() function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on in section 5.3 (the Component Catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.

5.2.3 Generating the Output Structure

The getOutputSchema() function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the catalog in section 5.3.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText(FIELD) is equivalent to rs.setText(FIELD, "", OPTIONAL). However, rs.setText(FIELD, OBLIGATORY) is not valid; the following must be used instead: rs.setText(FIELD, "", OBLIGATORY).

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections define them.

5.3.2.1 Record Structure

• Object: Record_Structure

• Description: This represents a data structure that allows the definition of the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).

• Functions:

  o Constructor(name)
    • name: name of the structure.

  o setText(field, regexp, type): creation of a new character string field in the record.

    • field: name of the new field.
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

  o setLink(field, type): creation of a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

  o setInt(field, type): creation of a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

  o setBoolean(field, type): creation of a new Boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

  o setLong(field, type): creation of a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

  o setFloat(field, type): creation of a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

  o setDouble(field, type): creation of a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

  o setBlob(field, type): creation of a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

  o setDate(field, regexp, format, type): creation of a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".
    • format (optional): date format, following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

  o setRegister(record, type): creation of a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

  o setArray(name, structure, type): creation of a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type (optional): defines whether the parameter is mandatory or not. By default the field is optional.

  o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.

NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR component must be used, as explained in section 5.3.22, except for Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List

• Functions:

  o setListName(listName): sets the name of the list.
    • listName: name of the list.

  o add(obj): addition of an element to the list.
    • obj: element to add.

  o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore shown in this first section. The catalog indicates which components do not provide some of these "common" functions.

5.3.3.1 onError function

• onError(errorId, errorAction): This tells the component how to behave in the event of an error. The onError function can be invoked several times with different errorId parameter values.

  o errorId: indicates the type of error for which the behavior is to be managed. The possible values are:

    • RUNTIME_ERROR: error while the component is being run.

    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.

    • HTTP_ERROR: error produced by an HTTP error.

    • TIMEOUT_ERROR: error caused when the Web source takes too long to answer. The waiting time is configurable. Where the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the http browser.

    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).

  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:

    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have any kind of return value will return "null" when there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it will be evaluated as "false" if they are used in a condition expression.

    • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configured in each parameter.

    • ON_ERROR_RETRY_IGNORE: rerun the wrapper as with ON_ERROR_RETRY, but continue with the wrapper execution if the error still occurs after the retries.

5.3.3.2 debugLevel function

• debugLevel(level): Sets the trace level to be used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:

o TRACE

o DEBUG

o INFO

o WARN

o ERROR

o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List

• Description: adds a record to a list.

• Functions:

  o Constructor()

  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition

• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.

• Functions:

  o Constructor(expr)
    • expr: this parameter defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, …, ELEMENTN]", determines the elements on which the condition is evaluated.

5.3.6 Create List

• Object: Create_List

• Description: creates an empty list.

• Functions:

  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.

  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser

• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.

• Functions:

  o Constructor(): creates a persistent browser and returns its handler.

  o exec(): executes the component.

5.3.8 Diff

• Object: Diff

• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.

• Functions:

  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.

  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.

  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.

  o setAdditionPrefixLabel(additionPrefixLabel): modifies the additional data starting tag.
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).

  o setAdditionSuffixLabel(additionSuffixLabel): modifies the additional data ending tag.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).

  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the deleted data starting tag.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.

    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).

  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.

    • nullWhenEquals: “true” implies that “null” will be returned when both pages are equal; “false” means that the result page will be returned.

  o setIgnoreTagAttributes(simplifyTags): the component will not take the HTML tag attributes into account when comparing both pages.

    • simplifyTags: “true” means that the HTML tag attributes will be ignored. With “false”, they will not be ignored.

  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.

    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.

  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page.

    • mergedDeletions: with “true”, the deleted content will be shown. If the value is “false”, the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.

  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.

    • replacement: Perl [PERL] regular expression.

  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page. Tokens that match the regular expression are discarded before the comparison starts.

    • regexp: Perl [PERL] regular expression.

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.

• Functions:

  o Constructor(expression)

    • expression: object that defines the condition expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.

    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor

• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).

• Functions:

  o Constructor(name, page, specification, structure)

    • name: name of the Extractor component instance.

    • page: page-type ITPilot structure from which data is to be extracted.

    • specification: DEXTL data extraction specification (see [DEXTL]).

    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.

  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.

  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).

    • merge: Boolean parameter; “true” if the pattern merge technique is to be applied, or “false” if not. This is “true” by default.

  o setI18n(i18n): function that updates the process internationalization.

    • i18n: type of internationalization to use. ITPilot provides different types of internationalization options, such as ES_EURO, US_PST, GB, and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: this obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

  o Constructor(url, sequenceType, reusableConnection, binary, page)

    • url: URL where the resource to be downloaded can be found (OPTIONAL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.

    • page: optionally, the page from which the HTTP request is launched can be indicated.

  o exec(page): this runs the component, returning the string- or binary-type value obtained.

    • page: optionally, the page from which the HTTP request is launched can be indicated.

  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.

    • encoding: MIME type of the information to send.

  o syncWithPost(flag): this function lets the user set the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.

    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence, defined by the setBackSequence function, exists; if it does not exist, ITPilot executes a Back() NSEQL command.

  o setBackSequence(back): this function lets the user optionally set an explicit browsing sequence back to the source page, against which further data extraction operations are going to be executed.

    • back: back sequence NSEQL program.

  o setReusingConnection(reusingConnection): this function indicates whether connections will be reused or not.

    • reusingConnection: if the value is set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser will be launched, importing information from the previous session.

  o setBackPages(pages): this function determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, which happens if neither a back sequence nor POST navigation has been defined.

  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
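
The order in which syncWithPost, setBackSequence and setBackPages interact can be summarized with a small decision function. This is only a sketch of the rules stated above, not ITPilot code: the function name, the option object, the returned objects and the default of one back page are all assumptions made for illustration.

```javascript
// Sketch of the synchronization fallback order described for
// syncWithPost, setBackSequence and setBackPages.
// All names and the backPages default are illustrative assumptions.
function chooseSyncStrategy({ syncWithPost = true, backSequence = null, backPages = 1 } = {}) {
  if (syncWithPost) {
    // Default method: re-send the POST parameters used to reach the page.
    return { kind: "POST" };
  }
  if (backSequence) {
    // An explicit NSEQL back sequence was configured.
    return { kind: "BACK_SEQUENCE", nseql: backSequence };
  }
  // Last resort: issue the NSEQL Back() command for the configured page count.
  return { kind: "BACK_COMMAND", pages: backPages };
}

const byDefault = chooseSyncStrategy();
const withSequence = chooseSyncStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
const withBack = chooseSyncStrategy({ syncWithPost: false, backPages: 2 });
console.log(byDefault.kind, withSequence.kind, withBack.kind, withBack.pages);
```

The same fallback order is described for the other components that expose these three functions.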

5.3.13 Filter

• Object: Filter

• Description: this carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

  o Constructor(expr, auxiliaryRecords)

    • expr: regular expression of the filtering operation for a list of records, which are described in the exec function.

    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.

  o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subgroup complying with the selection expression indicated in the constructor.

    • inputRecords: list of input records.

    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
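
The effect of ON_ERROR_IGNORE described in the note can be illustrated with plain JavaScript. The sketch below does not use the ITPilot Filter object: filterRecords, isCheap and the sample records are hypothetical, and a JavaScript predicate stands in for the filter expression.

```javascript
// Hypothetical sketch: keeps the records for which the predicate holds.
// With ignoreErrors = true, a record whose evaluation throws is skipped,
// mimicking the ON_ERROR_IGNORE behaviour described in the note.
function filterRecords(inputRecords, predicate, ignoreErrors) {
  const result = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record)) {
        result.push(record);
      }
    } catch (err) {
      if (!ignoreErrors) {
        throw err; // comparable to raising the error to the caller
      }
    }
  }
  return result;
}

// Sample predicate that fails on records without a price.
const isCheap = (record) => {
  if (record.price === null) {
    throw new Error("missing price");
  }
  return record.price <= 20;
};

const records = [{ price: 10 }, { price: null }, { price: 30 }];
const cheap = filterRecords(records, isCheap, true);
console.log(cheap);
```

With ignoreErrors set to false, the same call stops at the second record and propagates the error instead of returning a partial list.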

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: this allows a run loop to be generated for a specific form, where predetermined values for each of the included fields are used in each run.

• Functions:

  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

    • findForm: NSEQL program that locates the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.

    • inputPage: input page from which the selected form can be iteratively invoked.

    • parallelIterator: with “true”, the component will execute its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting with position 0.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): this indicates the values selected from a multiple selection field for the chosen form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting with position 0.

    • valuesArray: list of values that must be selected in the field.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).

    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): this indicates the values selected from a selection field for the chosen form.

    • field: name of the HTML selection field.

    • position: position occupied in the event of more than one field element with the same name.

    • positions: values of the elements on which the component must iterate.

  o selectTexts(field, position, values, positions, equal): this indicates the values to be used in the different iterations on a text field.

    • field: name of the HTML text field.

    • position: position of the field in the event of several on the form with the same name.

    • values: list of values that must be selected in the field.

    • positions: list that indicates the position held by each value element in the event of replicated values.

    • equals: boolean value that indicates whether the field values must exactly match those provided to the function (“true”) or may merely contain them (“false”).

  o click(field, value, state): function that allows an element to be selected and a “click” event to be run on it.

    • field: name of the HTML field on which the click is to be made.

    • value: when this function is run on Radio Buttons, this parameter indicates the elements selected, as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.

    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o input(field, position, values): function that indicates the values added to an input field.

    • field: name of the HTML input field.

    • position: position of the field in the event of several on the form with the same name.

    • values: list of values that must be selected in the field.

  o textarea(field, position, values): this indicates the values added to a text area.

    • field: name of the HTML input field.

    • position: position of the field in the event of several on the form with the same name.

    • values: list of values that must be selected in the field.

  o toList(): returns the list with the NSEQL sequences used in each iteration.

  o setMaxIterations(count): sets the maximum number of iterations that can be executed.

    • count: number that determines the maximum number of iterations.

  o setRetries(count): update method for the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): this allows the waiting time between retries to be indicated.

    • mseconds: waiting time between retries, in milliseconds.

  o setParallelIterator(flag): the component launches the iteration in parallel.

    • flag: with “true”, the iterations will be executed in parallel.

  o next(inputPage): this returns the page resulting from running a component iteration.

    • inputPage: optional parameter that allows a new starting page to be indicated, on which a new component iteration is run.

  o hasNext(): function that determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.

  o close(): function that closes the iterator.

  o syncWithPost(flag): this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.

    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run.

  o setBackSequence(back): this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor POST navigation has been configured.

  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
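
To get a feel for how the configured field values translate into iterations, the sketch below enumerates value combinations in plain JavaScript. It assumes that the component runs one iteration per combination of the configured values, which the descriptions above suggest but do not state outright. The constants are redefined locally as strings, and expandState and combinations are hypothetical helpers, not part of the Form_Iterator API.

```javascript
// Illustrative constants; in a wrapper the real ones come from the ITPilot runtime.
const CLICKED_ELEMENT = "CLICKED_ELEMENT";
const NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
const CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Expands a checkbox state into the concrete states to try.
function expandState(state) {
  return state === CLICKED_AND_NON_CLICKED_ELEMENT
    ? [CLICKED_ELEMENT, NON_CLICKED_ELEMENT]
    : [state];
}

// Cartesian product of the candidate values of every field,
// truncated to maxIterations (in the spirit of setMaxIterations).
function combinations(fields, maxIterations = Infinity) {
  let combos = [{}];
  for (const [name, values] of Object.entries(fields)) {
    const next = [];
    for (const combo of combos) {
      for (const value of values) {
        next.push({ ...combo, [name]: value });
      }
    }
    combos = next;
  }
  return combos.slice(0, maxIterations);
}

const plan = combinations({
  query: ["denodo", "itpilot"], // values as passed to input(field, position, values)
  exactMatch: expandState(CLICKED_AND_NON_CLICKED_ELEMENT), // state as passed to click(...)
});
const capped = combinations({ page: [1, 2, 3] }, 2);
console.log(plan.length, capped.length);
```

Each additional field multiplies the number of combinations, which is why a cap such as setMaxIterations is useful on forms with many selectable values.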

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool using a previously retrieved identification code.

• Functions:

  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identification.

    • browserUuid: browser id.

  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27).

    • pageType: type of browser used to access the page:

      • SEQUENCE_IEBROWSER = 1

      • SEQUENCE_HTTP_BROWSER = 2

    • lastURL: last URL the page is coming from.

    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.

    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.

    • cookie: information storage “cookies”.

    • proxyUser: user name to access the Proxy, if required.

    • proxyPassword: user password to access the Proxy, if required.

    • proxyDomain: Proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: this is responsible for storing the structure of the input data, which is the data that the wrapper will receive from the calling application.

• Functions:

  o Constructor(input, output)

    • input: input record of the component. Optionally used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context.

    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record will be generated at runtime (with the exec() function).

  o get(name): this returns the value of a record field created as a group of initialization parameters.

    • name: name of the record field.

  o setText(field, obl, fixedValue): this creates a text-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): this creates an integer-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): this creates a long-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): this creates a floating-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): this creates a double-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): this creates a BLOB-type (binary large object) field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): this creates a Boolean-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): this creates a URL-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): this creates a date-type field in the initialization record.

    • field: name of the field to create.

    • format: representation format of the date field. This format is optional, but once it is specified it becomes compulsory; otherwise the wrapper may not run. This representation format is defined in [DATEFORMAT].

    • obl: parameter that indicates the compulsory nature of the query on the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): update function for the component name.

    • name: new component name.

  o setI18n(i18n): function that updates the process internationalization (i18n).

    • i18n: type of internationalization to be used. ITPilot provides different types of i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

  o exec(): main function for running the component, returning a record that represents the wrapper initialization parameters.
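
The difference between OPTIONAL, OBLIGATORY and FIXED fields can be sketched with a small stand-in class. InitStub below is hypothetical and exists only to make the snippet runnable: the real Init component reads the query parameters from the JavaScript context, so its exec() takes no arguments, whereas the stub receives them explicitly. Storing null for an optional field without a value is also an assumption.

```javascript
// Illustrative constants; in a wrapper they are provided by the ITPilot runtime.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// Hypothetical stand-in for the Init component (NOT the real implementation).
class InitStub {
  constructor() {
    this.fields = [];
  }

  // Declares a text field; obl defaults to OPTIONAL as in the description above.
  setText(field, obl = OPTIONAL, fixedValue) {
    this.fields.push({ field, obl, fixedValue });
  }

  // Builds the initialization record for one wrapper query.
  // Assumption: queryParams is passed in explicitly instead of being read from a context.
  exec(queryParams) {
    const record = {};
    for (const { field, obl, fixedValue } of this.fields) {
      if (obl === FIXED) {
        record[field] = fixedValue; // constant, regardless of the query
      } else if (field in queryParams) {
        record[field] = queryParams[field];
      } else if (obl === OBLIGATORY) {
        throw new Error("missing obligatory parameter: " + field);
      } else {
        record[field] = null; // optional and not supplied
      }
    }
    return record;
  }
}

const init = new InitStub();
init.setText("keyword", OBLIGATORY);
init.setText("language", FIXED, "en");
init.setText("category");
const record = init.exec({ keyword: "denodo" });
console.log(record);
```

Calling exec without the keyword parameter makes the stub throw, which mirrors the rule that an obligatory parameter must be present in every query on the wrapper.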

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records, one by one.

• Functions:

  o Constructor(list)

    • list: list of records over which to iterate.

  o hasNext(): this determines whether there are more results over which to iterate. “true” is returned if there is at least one more result.

  o next(): this returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
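
For comparison with the parallel structure of Figure 5, the sequential hasNext()/next() loop looks as follows. IteratorStub is a hypothetical stand-in over a plain array, used only so that the snippet runs outside ITPilot; the record fields are invented for the example.

```javascript
// Hypothetical stand-in for the Iterator component over a plain array.
class IteratorStub {
  constructor(list) {
    this.list = list;
    this.index = 0;
  }

  // true while at least one more record remains
  hasNext() {
    return this.index < this.list.length;
  }

  // returns the next record, in list order
  next() {
    return this.list[this.index++];
  }
}

const iterator = new IteratorStub([{ title: "a" }, { title: "b" }, { title: "c" }]);
const titles = [];
while (iterator.hasNext()) {
  const recordInstance = iterator.next();
  titles.push(recordInstance.title); // per-record processing goes here
}
console.log(titles);
```

In the parallel variant, the loop body hands each record to the Thread object instead of processing it inline.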

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: these functions allow a query to be sent to any source available via JDBC and return a record list with the results obtained.

• Functions:

  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

    • uuid: component unique identifier.

    • uri: connection URL to the database.

    • driver: driver class to use to connect to the data source.

    • userName: user name.

    • password: user password.

    • structure: structure of the component’s output record list. It is defined as a record of values.

    • baseRecords: record list to be used.

    • maxPoolSize: maximum number of connections that can be managed by the browser pool at the same time.

    • initialPoolSize: initial number of browser pool connections. This number of idle connections is established, ready to be used.

    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

    • query: SQL query that returns the results required by the component.

  o exec(query, baseRecords): executes the JDBCExtractor component.

    • query: SQL query that returns the results required by the component.

    • baseRecords: record list to be used.

  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

    • maxPoolSize: maximum number of connections that can be managed by the browser pool at the same time.

    • initialPoolSize: initial number of browser pool connections. This number of idle connections is established, ready to be used.

    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.

  o addDriverProperty(propname, propvalue): adds a JDBC driver property.

    • propname: property name.

    • propvalue: property value.

5.3.19 Loop

• Description: this allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: this allows for iteration over different inter-related pages, using one or several browsing sequences.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

    • sequences: list of browsing sequences to use. If there is only one sequence, the component will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.

    • iterations: this indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session’s information.

    • inputPage: this indicates the page from which the next browsing sequence is to be run.

  o next(inputRecords, inputPage): this returns the next iteration element.

    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.

    • inputPage: this indicates the page from which the next pages are to be accessed.

  o close(): this closes the iterator.

  o setRetries(count): this configures the number of retries in the event of an error when accessing the next page.

    • count: number of retries.

  o setRetryDelay(count): this configures the interval between two retries.

    • count: interval in milliseconds.

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 50

o syncWithPost(flag) this function indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function

bull flag ldquotruerdquo indicates that this synchronization function is to be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() method is run

o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out

bull back NSEQL back program

o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not

bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session

o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 HTTP browser implementation

5.3.21 Output

• Object: Output

• Description: places a record in the wrapper output.

• Functions:

  o Constructor(structure)

    • structure: parameter that indicates the component input record to be used as the wrapper result.

  o add(record): adds, at a later point, the component input record to be used as the wrapper result.

    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.

• Functions:

  o Constructor(recordsObj, name)

    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.

    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): adds a new field to the record under construction.

    • fieldName: name of the field.

    • expression: field definition expression. For example, “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:

      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements, except for the one that caused the error.

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.

    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL

    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases this may not be a good option if the previous iterator is run in parallel with it.

    • inputPage: optional. Allows a starting page to be indicated.

  o exec(): returns a page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as browser identifier and releases that specific browser.

• Functions:

  o Constructor(page)

    • page: page loaded on the browser that is going to be released.

  o Constructor(browserUuid)

    • browserUuid: browser identifier.

  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical wrapper generation tool, it becomes a JavaScript function that is invoked from the position the component occupies in the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)

    • sequence: NSEQL browsing program (see [NSEQL]).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.

  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.

    • inputValues: list of values that can be used as input parameters within the browsing sequence.

    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o close(): closes the connection with the running browser.

  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence for returning to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.

    • pages: number of back pages.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
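The effect of the setRetries and setRetryDelay settings described above can be pictured with a small retry helper. This is only an illustrative sketch, not ITPilot code: runWithRetries, recordWait and the sample action are hypothetical names, and the sketch assumes that the retry count means additional attempts after the first failure, which this guide does not state explicitly.

```javascript
// Hypothetical helper, not part of ITPilot: calls action() and, if it throws,
// retries up to `retries` more times, calling sleep(delayMs) between attempts.
// Assumption: `retries` counts the extra attempts after the first failure.
function runWithRetries(action, retries, delayMs, sleep) {
    var lastError = null;
    for (var attempt = 0; attempt <= retries; attempt++) {
        try {
            return action(attempt);
        } catch (err) {
            lastError = err;
            if (attempt < retries) {
                sleep(delayMs);   // wait before the next attempt
            }
        }
    }
    throw lastError;              // every attempt failed
}

// Usage: an action that fails twice and then succeeds.
var waits = [];
function recordWait(ms) {
    waits.push(ms);               // records the delay instead of blocking
}

var calls = 0;
var page = runWithRetries(function () {
    calls++;
    if (calls < 3) {
        throw new Error("page not available yet");
    }
    return "page loaded on attempt " + calls;
}, 3, 500, recordWait);
```

Passing the wait function as an argument keeps the sketch synchronous and easy to exercise. A real wrapper would rely on the component's own retry handling instead.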

5.3.28 Store File

• Object: StoreFile

• Description: stores the contents received as input parameter in a file.

• Functions:

  o Constructor(content, file)

    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.

    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread

• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): causes the thread to enter standby until all the executions invoked with the execute function have finished.

  o execute(functionName, <list of arguments>): launches a thread that runs the given function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.

    • int: maximum number.
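The queuing behaviour described for setMaxConcurrentThreads can be modelled with a small runner that starts at most a fixed number of tasks and holds the rest until a slot is free. This is a sketch of the general technique only: BoundedRunner and the done callback are hypothetical names, and nothing here reflects how ITPilot implements its Thread component internally.

```javascript
// Hypothetical model of a bounded executor: at most maxConcurrent tasks
// are active at once; later requests wait in a FIFO queue.
function BoundedRunner(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = 0;
    this.queue = [];
}

// Submits a task. The task receives a done() callback that it must call
// when it has finished, so that a queued task can take its slot.
BoundedRunner.prototype.execute = function (task) {
    this.queue.push(task);
    this.drain();
};

// Starts queued tasks while there are free slots.
BoundedRunner.prototype.drain = function () {
    var self = this;
    while (this.running < this.maxConcurrent && this.queue.length > 0) {
        var task = this.queue.shift();
        this.running++;
        task(function done() {
            self.running--;
            self.drain();
        });
    }
};

// Usage: three requests with a limit of two concurrent executions.
var started = [];
var finishers = [];
var runner = new BoundedRunner(2);
["a", "b", "c"].forEach(function (name) {
    runner.execute(function (done) {
        started.push(name);     // the task has begun
        finishers.push(done);   // keep done() so the task can be finished later
    });
});
var startedBeforeFinish = started.slice();   // "a" and "b" run, "c" is queued
finishers[0]();                              // "a" finishes, so "c" starts
```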

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components, and define the following functions in it:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:

    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
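As an illustration of the functions listed above, the following sketch shows what the main and output-type functions of a very small component might look like. It makes several assumptions: the component is named “mycustom”, it receives a simple string and returns it without surrounding whitespace, and STRING_TYPE is declared locally (with the value listed above) only so that the sketch runs on its own. The functions mycustom_getInputStructure and mycustom_getOutputStructure are left out, because the way schemas are described is not covered in this section.

```javascript
// Output type code taken from the list above; declared here only so that
// the sketch is self-contained.
var STRING_TYPE = 13;

// Main function of a hypothetical component named "mycustom".
// Assumption: the input is a simple string value (or null).
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    if (mycustom_input !== null && mycustom_input !== undefined) {
        // Remove leading and trailing whitespace.
        mycustom_output = String(mycustom_input).replace(/^\s+|\s+$/g, "");
    }
    return mycustom_output;
}

// Declares that the component returns a string value.
function mycustom_getOutputType() {
    return STRING_TYPE;
}
```

Since the declared output type is neither RECORD_TYPE nor LIST_TYPE, an output structure function would not be needed for a component like this one.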

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing a VQL statement as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

3 ITPILOT DEVELOPMENT API

Denodo ITPilot incorporates a Java API for developing applications that use the wrappers created with it. Among other functions, this API makes it possible to connect to a Denodo ITPilot execution server, obtain a reference to a wrapper installed on that server, and query it. It also supports additional tasks, such as obtaining the list of wrappers installed on the server or activating automatic maintenance of a specific wrapper.

The first step in using the API is to connect to a Denodo ITPilot execution server. This is done by constructing an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. Among other tasks, this instance makes it possible to obtain a list of the wrappers available on the server, as well as a reference to a specific wrapper, represented by an instance of the class HTMLWrapperProxy. That instance can be used to carry out various tasks on the wrapper, the most important of which is query execution.

When a query is invoked on the wrapper, the results are returned to the application asynchronously (i.e., the first results of the query are accessible to the application as they are obtained from the source, without having to wait for all the results to be received).

The following subsections deal in more detail with each of the stages mentioned: connecting to the server, obtaining references to wrappers, executing actions on them, and processing query results. An exhaustive description of the API at the programming level can be found in the Javadoc documentation [JDOC].

3.1 CONNECTING TO THE SERVER

There are two ways to connect to the ITPilot execution server, depending on whether Denodo Virtual DataPort [DPORT] is installed in the same location as ITPilot.

If Denodo ITPilot has been installed separately, the default server connection mode should be used (constructor HTMLWrapperServerProxy(String host, int port)), indicating the machine and port on which the server is running.

If Denodo ITPilot is installed jointly with Denodo Virtual DataPort, DataPort is used as the execution server for ITPilot. In this case, the connection can specify any database created in the Virtual DataPort server and use any user defined in it. The actions allowed for the user are consistent with the permissions assigned to that user in the DataPort server for the specified database (see [DPORT] for more information on the structure of databases, permissions and users in Denodo Virtual DataPort).

In this case, the constructor HTMLWrapperServerProxy(String host, int port, String dbName, String login, String password) may be used. In addition to the machine and port on which the server is running, this constructor takes the name of the Virtual DataPort database to connect to, the user ID with which access is made, and the associated password.

Even if Virtual DataPort is installed, it is still possible to access the server using the default mode (constructor HTMLWrapperServerProxy(String host, int port)). In this case, a default database called ‘itpilot’ is accessed, using the predefined user ‘admin’ (with the initial password ‘admin’).

3.2 OBTAINING WRAPPERS

As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class incorporates methods for obtaining data on the execution server and accessing the wrappers present in it:

• Collection getHTMLWrapperNames(): obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only the wrappers associated with that database are obtained.

• HTMLWrapperProxy getHTMLWrapper(String wpName): obtains a reference to the wrapper whose name is specified as parameter.

• Collection getDatabaseNames(): this method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.

• void deleteWrapper(String wpName): deletes from the server the wrapper whose name is specified as parameter.

• void loadWrapper(String vql): takes as input argument the VQL that defines a collection of wrappers, which are loaded in the execution server.

• String getVQL(): returns the VQL description of all the wrappers in the ITPilot execution server.

3.3 USING WRAPPERS

Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:

    HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified during the creation of the wrapper. The values must be specified as character strings, even when the input parameters expected by the wrapper belong to another type. For example, if a wrapper expects a float-type parameter and the value 325 is to be assigned to it, the string “325” must be passed. In the case of float, double and date data types, it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or, in the case of date data types, according to the date pattern, if one was set.

For the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.

Although most applications will not require it, a wrapper schema can be obtained using the method:

    HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).

The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, Boolean or blob. The value of a compound attribute is always an array of registers. In turn, each register is composed of several fields, and again these fields may be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director’s cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a “hierarchical” register of the type described above. See the Javadoc documentation [JDOC] for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the various atomic fields that comprise the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field:

• Its type (method java.lang.Class getType()).

• Whether the value is obtained from the source or not, that is, whether it is a searchable field that is not present in the output schema (method boolean isSearchStatus()) and, in that case, whether it is mandatory or not (method boolean isMandatoryStatus()).

• If they were defined during the generation process, the regular expression (method java.lang.String getRegexp()) and the aliases defined for the field (method java.util.List getTextValues()).

Finally, the methods

    void setMaintenance(boolean value)
    void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting, via the API, whether a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be installed automatically in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper is regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.

34 PROCESSING QUERY RESULTS

The query method for executing queries to a wrapper returns as a result an instance of the class comdenodoitpilotclientHTMLWrapperResultIterator This class (which implements the interface javautilIterator) provides asynchronous access to the results of the query made Results being accessed in an asynchronous manner means that the server will return results of the query as they are obtained from the source (it is important to remember that the wrapper obtains the data from the source in real time through the network)

ITPilot 46 Developer Guide

ITPilot Development API 10

The method hasNext() checks whether there are still elements to return. Because results arrive asynchronously, this method must be called before accessing each element to make sure that data is available. The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method

    com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname)

where fieldname is the name of the desired field. The method next() throws a NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX() (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:

• Boolean checkErrors(): Checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.

• String getErrorDescription(): If an error has occurred, returns a textual description of it. Otherwise it returns null. The custom error messages specified by the wrapper creator for the 'raise' error handler in the Wrapper Generator Tool (see [GENER]) are accessed through this method.
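The iteration and error-checking pattern described above can be sketched as follows. This is a minimal, self-contained illustration and not the Denodo API: ResultSource and InMemoryResults are hypothetical stand-ins that expose only the four methods named in this section (hasNext(), next(), checkErrors() and getErrorDescription()), so that the sketch runs without an ITPilot server.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Sketch of the read loop: guard every next() with hasNext(), then check errors. */
class ResultLoopSketch {

    static List<Object> readAll(ResultSource results) {
        List<Object> rows = new ArrayList<Object>();
        // next() may throw NoSuchElementException, so hasNext() is always called first.
        while (results.hasNext()) {
            rows.add(results.next());
        }
        // Errors are reported through the iterator, so check them after the loop.
        if (results.checkErrors()) {
            throw new IllegalStateException(results.getErrorDescription());
        }
        return rows;
    }

    public static void main(String[] args) {
        ResultSource ok = new InMemoryResults(Arrays.asList("row1", "row2"), null);
        System.out.println("rows read: " + readAll(ok).size());
    }
}

/** Hypothetical stand-in for the result iterator; only the methods described above. */
interface ResultSource {
    boolean hasNext();
    Object next();
    boolean checkErrors();
    String getErrorDescription();
}

/** In-memory implementation used only to make the sketch runnable. */
class InMemoryResults implements ResultSource {
    private final Iterator<String> rows;
    private final String error; // null means that no error occurred

    InMemoryResults(List<String> rows, String error) {
        this.rows = rows.iterator();
        this.error = error;
    }

    public boolean hasNext() { return rows.hasNext(); }
    public Object next() { return rows.next(); }
    public boolean checkErrors() { return error != null; }
    public String getErrorDescription() { return error; }
}
```

Turning a reported error into an exception is a design choice of this sketch; an application could equally log the description and continue.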


3.4.1 Canceling Queries

The following method of the class HTMLWrapperResultIterator cancels the execution of an ongoing query:

void cancel()

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine 'acme', on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the same one used as an example in the preceding section:

    TITLE, DIRECTOR, EDITIONS (array of registers with the fields FORMAT, PRICE, DESCRIPTION)

where TITLE and DIRECTOR are optional search fields.

Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output. To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then the compound field EDITIONS is obtained; it is represented by an ArrayVO object that contains one RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text, except the field PRICE, which is a double. Finally, any errors produced during execution are checked.


    package com.denodo.itpilot.client;

    import java.util.List;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Iterator;

    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    // DoubleVO is assumed to live in the same package as SimpleVO.
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

    public class ITPilotExample {

        public static void main(String args[]) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server =
                        new HTMLWrapperServerProxy("acme", 9999);

                // Get wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ": ");

                    // Get and print atomic fields TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);

                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);
                        RegisterVO editionVO = (RegisterVO) editions.next();

                        SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);

                        // PRICE is the only field that is not of type text
                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);

                        SimpleVO descriptionVO =
                                (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\t DESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server: " + e.getMessage());
            } finally {
            }
        }
    }

Figure 1. Example of query execution to a wrapper


4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created without dependencies on Denodo libraries by following naming conventions, but we recommend developing them using Java annotations. The annotations, and the compound types and values required to create custom functions, are located in

    $DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.

• All custom functions stored in the same jar are added or removed together, by uploading/removing the jar in the server.

• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.

• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing its whole implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, the class must define a Java method implementing the function with those arguments, and one additional method in case the signature returns a different type depending on the parameters or the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

    Java                  ITPilot
    java.lang.Integer     int
    java.lang.Long        long
    java.lang.Float       float
    java.lang.Double      double
    java.lang.Boolean     boolean
    java.lang.String      text
    java.util.Calendar    date
    byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be basic (primitive) Java types: int, long, double, etc.


4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g., the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining this method is optional. If it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:

    o name: name of the custom function.

    o type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to specify the method as a function signature. This method will be executed when using the function with the appropriate arguments. The annotation has an optional attribute, syntax, to specify the syntax of the function signature when presenting it to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.

• com.denodo.common.custom.annotations.CustomParam: Parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
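The naming conventions described at the beginning of this section can be sketched with plain Java, without any Denodo library. The class below follows the <FunctionName> + "ItpFunction" pattern and declares one execute method whose last (and only) parameter is an array, giving it arity n. The decision to skip null inputs is an assumption of this sketch, not documented ITPilot behavior.

```java
/**
 * Naming-convention sketch: the class name is <FunctionName> + "ItpFunction",
 * so this class would be interpreted as a function named Concat_Sample.
 * It is declared without "public" only so that the sketch compiles under any
 * file name; a deployed class would presumably be public.
 */
class Concat_SampleItpFunction {

    /** Signature with arity n: the last parameter is an array (varargs). */
    public String execute(String... inputs) {
        StringBuilder joined = new StringBuilder();
        for (String input : inputs) {
            if (input != null) { // assumption: null arguments are skipped
                joined.append(input);
            }
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        Concat_SampleItpFunction concat = new Concat_SampleItpFunction();
        System.out.println(concat.execute("IT", "Pilot"));
    }
}
```

Parameters use java.lang.String rather than a primitive type, in line with the rule that parameters of custom functions cannot be basic Java types.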

4.2 COMPOUND TYPES

Compound types and values in custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either a java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on the input values, and functions returning compound types, can implement an additional method to compute the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained directly from the method).

2. The execution method must have the same number of parameters as the additional method.

3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.

If the execute method returns a basic Java type, the additional method has to return the same basic Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 for the type that these return values will have in ITPilot.
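The correspondence between an execution method and its return-type method can be sketched with plain Java and the naming conventions of section 4.1. Max_Sample is a hypothetical function with two signatures; each executeReturnType method takes the same parameters as its execute counterpart and returns the Java class of the result. Declaring the return type as Class<?> is an assumption of this sketch; the guide only states that the class itself (e.g. java.lang.String.class) must be returned.

```java
/**
 * Sketch of a hypothetical function Max_Sample with two signatures.
 * Declared without "public" only so that the sketch compiles under any file name.
 */
class Max_SampleItpFunction {

    /** Signature 1: two int values. Wrapper classes are used, not primitives. */
    public Integer execute(Integer a, Integer b) {
        return a >= b ? a : b;
    }

    /** Return type of signature 1, computed without executing the function. */
    public Class<?> executeReturnType(Integer a, Integer b) {
        return Integer.class;
    }

    /** Signature 2: two double values. */
    public Double execute(Double a, Double b) {
        return a >= b ? a : b;
    }

    /** Return type of signature 2. */
    public Class<?> executeReturnType(Double a, Double b) {
        return Double.class;
    }

    public static void main(String[] args) {
        Max_SampleItpFunction max = new Max_SampleItpFunction();
        System.out.println(max.execute(3, 7));
        System.out.println(max.executeReturnType(1.5, 2.5).getSimpleName());
    }
}
```

Because each signature here always returns the same simple type, the return-type methods fall under the optional case of rule 1; they are shown only to illustrate the pairing. The sketch does not handle null arguments.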


4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT_SAMPLE, which splits a string around matches of a given regular expression and returns the array of the resulting substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        // Execution method: one record with a single text field per substring
        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name="regexp") String regex,
                @CustomParam(name="value") String value) {

            if (value == null || regex == null) {
                return null;
            }

            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results =
                    new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues =
                    new ArrayList<CustomRecordValue>(result.length);
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue =
                        CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        // Return type method: an array of records with one text field
        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props =
                    new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record =
                    CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array =
                    CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2. ITPilot Custom Function Sample


5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262].

The following sections assume basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3. ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.

2. getInit() function: this must be used to return the set of searchable parameters.

3. getOutputSchema() function: this function is used to return the structure of the output objects, if they exist¹.

These functions correspond to the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

¹ Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backward compatibility is maintained, but using the new name is strongly recommended.


5.2.1 Initialization of Searchable Parameters

This function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the Component Catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure

This function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the catalog of section 5.3.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters of the functions described can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

    rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL)

    rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY)

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections define them.

5.3.2.1 Record Structure

• Object: Record_Structure

• Description: This represents a data structure that allows the definition of the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).

• Functions:

    o Constructor(name)

        • name: name of the structure.

    o setText(field, regexp, type): creation of a new character string field in the record.

        • field: name of the new field.

        • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".

        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setLink(field, type): creation of a new Link-type field in the record.

        • field: name of the new field.

        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setInt(field, type): creation of a new Integer-type field in the record.

        • field: name of the new field.

        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setBoolean(field, type): creation of a new boolean-type field in the record.

        • field: name of the new field.

        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setLong(field, type): creation of a new Long-type field in the record.

        • field: name of the new field.

        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setFloat(field, type): creation of a new Float-type field in the record.

        • field: name of the new field.

        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setDouble(field, type): creation of a new Double-type field in the record.

        • field: name of the new field.

        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setBlob(field, type): creation of a new BLOB-type (Binary Large Object) field in the record.

        • field: name of the new field.

        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setDate(field, regexp, format, type): creation of a new Date-type field in the record.

        • field: name of the new field.

        • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".

        • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.

        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setRegister(record, type): creation of a new Record-type field in the record.

        • record: record name.

        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o setArray(name, structure, type): creation of a new Array-type field in the record.

        • name: name of the array.

        • structure: data structure that represents the record structure contained in the array.

        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

    o toString(): transforms the record into a character string for its representation.

When a custom component is created from an ITPilot wrapper program (see section 5.4), a Record Structure is defined to represent the input values of the custom component.

NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List

• Functions:

    o setListName(listName): sets the name of the list.

        • listName: name of the list.

    o add(obj): addition of an element to the list.

        • obj: element to add.

    o toArray(): transforms the list into a JavaScript object array.


5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these "common" functions.

5.3.3.1 onError function

• onError(errorId, errorAction): This tells the component how to behave in the event of an error. The onError function can be invoked several times with different errorId parameter values.

    o errorId: indicates the type of error for which the behavior is to be managed. The possible values are:

        • RUNTIME_ERROR: error while the component is being run.

        • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.

        • HTTP_ERROR: error produced by an HTTP error.

        • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the properties IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the http browser.

        • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).

    o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:

        • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

        • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value will return "null" when there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it will be evaluated as "false" if they are used in a condition expression.

        • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configured in each parameter.

        • ON_ERROR_RETRY_IGNORE: rerun the wrapper, as with ON_ERROR_RETRY, but continue with the wrapper execution if the error still occurs after the retries.


5.3.3.2 debugLevel function

• debugLevel(level): Sets the trace level used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:

    o TRACE

    o DEBUG

    o INFO

    o WARN

    o ERROR

    o FATAL


5.3.4 Add Record To List

• Object: Add_Object_To_List

• Description: adds a record to a list.

• Functions:

    o Constructor()

    o exec(record, list): executes the function.

        • record: record to be added to the list.

        • list: list to which the record is added.


5.3.5 Condition

• Object: Condition

• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.

• Functions:

    o Constructor(expr)

        • expr: this parameter defines the condition expression. It is expressed as a character string. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.

        • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, …, ELEMENTN]", determines the elements on which the condition is evaluated.


5.3.6 Create List

• Object: Create_List

• Description: creates an empty list.

• Functions:

    o Constructor(listname): creates an empty list.

        • listname: name of the list of records to be created.

    o exec(): runs the component.


5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser

• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.

• Functions:

    o Constructor(): creates a persistent browser and returns its handler.

    o exec(): executes the component.


538 Diff

bull Object Diff

bull Description the Diff component allows comparing two pages returning the differences between them regarding the retrieved HTML code

bull Functions

o Constructor(additionPrefixLabel additionSuffixLabel deletionPrefixLabel deletionSuffixLabel tokenSeparator)

bull additionPrefixLabel prefix to use when generating the result page for the new content (by default green background HTML tag)

bull additionSuffixLabel suffix to use when generating the result page for the new content (by default green background HTML end tag)

bull deletionPrefixLabel prefix to use when generating the result page for the deleted content (by default red background HTML tag)

bull deletionSuffixLabel prefix to use when generating the result page for the deleted content (by default red background HTML end tag)

bull tokenSeparator indicates the character string used as HTML page element separator when the result page is generated so that each one of them can be adequately identified

o diff (baseCode finalCode) returns ldquotruerdquo if both pages are identical ldquofalserdquo if they are different

bull baseCode character string with the source page content

bull finalCode character string or page object with the target page content

o exec (baseCode finalCode) executes the Diff component returning a character string that represents the HTML content of those pages pointing out the differences between them

bull baseCode character string with the source page content

bull finalCode character string or page object with the target page content

o setAdditionPrefixLabel (additionPrefixLabel) modifies the additional data starting tag

bull additionPrefixLabel prefix to use when generating the result page for new content (by default green background HTML tag)

o setAdditionSuffixLabel(additionSuffixLabel) modifies the additional data ending tag

bull additionSuffixLabel suffix to use when generating the result page for the new content (by default green background HTML end tag)

o setDeletionPrefixLabel(deletionPrefixLabel) modifies the deleted data starting tag

bull deletionPrefixLabel prefix to use when generating the result page for the deleted content (by default red background HTML tag)

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 29

o setDeletionSuffixLabel(deletionSuffixLabel) modifies the deleted data ending tag

bull deletionSuffixLabel prefix to use when generating the result page for the deleted content (by default red background HTML endtag)

o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.

bull nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.

o setIgnoreTagAttributes(simplifyTags): the component does not take HTML tag attributes into account when comparing the two pages.

bull simplifyTags: "true" means that the HTML tag attributes are ignored; with "false" they are not ignored.

o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.

bull toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.

o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page.

bull mergedDeletions: with "true", the deleted content is shown. If the value is "false", the configuration set with setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.

o addTokenReplacement(replacement) allows the addition of a regular expression to a list These regular expressions can be applied on HTML tokens of the source pages before comparing them

bull replacement Perl [PERL] regular expression

o addIgnoredToken(regexp) allows the addition of a regular expression to the list These regular expressions can be applied on HTML tokens of the page Those that match the regular expression will be discarded before starting the comparison

bull regexp Perl [PERL] regular expression
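
How the label and flag setters above shape the result of exec can be illustrated with a minimal sketch. SketchDiff is a hypothetical stand-in written for this example so that it runs outside ITPilot; it compares whitespace-separated tokens with a longest-common-subsequence table, which is an assumption and not necessarily how the real Diff component works, and its default labels are placeholders rather than the green and red background tags.

```javascript
// Hypothetical stand-in, NOT the ITPilot Diff implementation.
function SketchDiff() {
  this.addPrefix = "<ins>";  // placeholder default (ITPilot: green background tag)
  this.addSuffix = "</ins>";
  this.delPrefix = "<del>";  // placeholder default (ITPilot: red background tag)
  this.delSuffix = "</del>";
  this.nullWhenEquals = false;
  this.showRemoved = true;
}
SketchDiff.prototype.setAdditionPrefixLabel = function (s) { this.addPrefix = s; };
SketchDiff.prototype.setAdditionSuffixLabel = function (s) { this.addSuffix = s; };
SketchDiff.prototype.setDeletionPrefixLabel = function (s) { this.delPrefix = s; };
SketchDiff.prototype.setDeletionSuffixLabel = function (s) { this.delSuffix = s; };
SketchDiff.prototype.setNullWhenEquals = function (flag) { this.nullWhenEquals = flag; };
SketchDiff.prototype.setShowRemovedContent = function (flag) { this.showRemoved = flag; };

SketchDiff.prototype.exec = function (baseCode, finalCode) {
  var a = baseCode.split(" ");
  var b = finalCode.split(" ");
  var lcs = [];
  var i, j;
  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..].
  for (i = 0; i <= a.length; i++) {
    lcs[i] = [];
    for (j = 0; j <= b.length; j++) { lcs[i][j] = 0; }
  }
  for (i = a.length - 1; i >= 0; i--) {
    for (j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  var out = [];
  var changed = false;
  i = 0;
  j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && a[i] === b[j]) {
      out.push(a[i]);  // token present in both pages
      i++;
      j++;
    } else if (j < b.length && (i === a.length || lcs[i][j + 1] >= lcs[i + 1][j])) {
      out.push(this.addPrefix + b[j] + this.addSuffix);  // new content
      j++;
      changed = true;
    } else {
      if (this.showRemoved) {
        out.push(this.delPrefix + a[i] + this.delSuffix);  // deleted content
      }
      i++;
      changed = true;
    }
  }
  if (!changed && this.nullWhenEquals) { return null; }
  return out.join(" ");
};

var diff = new SketchDiff();
diff.setAdditionPrefixLabel("[+");
diff.setAdditionSuffixLabel("+]");
diff.setDeletionPrefixLabel("[-");
diff.setDeletionSuffixLabel("-]");
var marked = diff.exec("price 10 EUR", "price 12 EUR");
```

With setShowRemovedContent(false) the deleted token is dropped from the result, so the deletion labels have no effect, which mirrors the note on mergedDeletions.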

5.3.9 ExecuteJS

bull Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4 Using the ExecuteJS NSEQL command

5.3.10 Expression

bull Object Expression

bull Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.

bull Functions

o Constructor(expression)

bull expression: object that defines the condition expression. This object is expressed as a string of characters (e.g., MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

o exec(exprInput) method running the component and returning the value resulting from the expression indicated in the component constructor

bull exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

bull Object Extractor

bull Description this is responsible for extracting structured data from an HTML page thus generating a DEXTL program ([DEXTL])

bull Functions

o Constructor(name page specification structure)

bull name name of the Extractor component instance

bull page page-type ITPilot structure from where data is to be extracted

bull specification DEXTL data extraction specification (see [DEXTL])

bull structure name of the record (previously created) that will be used to return the data extracted by the specification

o exec() main extractor method running the specification indicated in the constructor This function returns a list of records of the type defined in the constructor in the structure parameter

o setMergePatterns(merge) This applies the technique of merging patterns for greater system optimization (see [GENER] for further information)

bull merge: Boolean parameter; "true" if the pattern merge technique is to be applied, or "false" if not. It is "true" by default.

o setI18n(i18n) Function that updates the process internationalization

bull i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB, and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

bull Object Fetch

bull Description this obtains the contents of the URL or page used as the input argument and returns them in binary or text format

bull Functions

o Constructor(url sequenceType reusableConnection binary page)

bull url URL where the resource to be downloaded can be found (OPTIONAL)

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

bull binary: "true" if the object to be downloaded is binary; "false" if it is in text format.

bull page: optionally, the page from which the HTTP request is launched.

o exec(page): runs the component, returning the string- or binary-type value obtained.

bull page: optionally, the page from which the HTTP request is launched.

o setEncoding(encoding) allows the user to determine the MIME type [MIME] of the information to send

bull encoding MIME type of the information to send

o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.

bull flag: "true" means that this synchronization method is used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.

o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which further data extraction operations will be executed.

bull back: NSEQL program of the back sequence.

o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

bull reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.

o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor POST synchronization has been configured.

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 Denodo HTTP browser implementation
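
The order in which the synchronization options of syncWithPost, setBackSequence and setBackPages are considered can be summarized in a small sketch. The helper chooseBackStrategy, its configuration object and its return values are hypothetical names introduced only for illustration; they are not part of the ITPilot API, and the fallback of one page when no page count is given is an assumption.

```javascript
// Hypothetical helper, not part of the ITPilot API: it only restates the
// documented precedence between POST synchronization, an explicit back
// sequence and the NSEQL Back() command.
function chooseBackStrategy(config) {
  if (config.syncWithPost) {
    // Re-send the POST parameters that were used to reach the page.
    return { method: "POST" };
  }
  if (config.backSequence) {
    // Run the NSEQL program supplied through setBackSequence.
    return { method: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // Assumption: go back one page when no page count was configured.
  return { method: "BACK_COMMAND", pages: config.backPages || 1 };
}

var withPost = chooseBackStrategy({ syncWithPost: true, backSequence: "<NSEQL back program>" });
var withSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

The same decision order applies to the other components in this chapter that expose these three functions.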

5.3.13 Filter

bull Object Filter

bull Description this carries out a filtering operation from a list of records returning those meeting a given condition

bull Functions

o Constructor(expr auxiliaryRecords)

bull expr regular expression of the filtering operation for a list of records which are described in the exec function

bull auxiliaryRecords record list that participates in the filter condition but which are not the records to filter

o exec(inputRecords auxiliaryRecords) function receiving a list of records and returning the subgroup complying with the selection expression indicated in the constructor

bull inputRecords list of input records

bull auxiliaryRecords record list that participates in the filter condition but which are not the records to filter

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements, except for the one that caused the error.
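
The behaviour described in the note can be sketched as follows. The function sketchFilter, the two constants and the use of a JavaScript predicate in place of the expression string are all hypothetical stand-ins defined for this example; the real component receives its condition in the constructor and its error handler through its own configuration.

```javascript
// Stand-in constants: hypothetical values so the sketch runs outside ITPilot.
var ON_ERROR_RAISE = "RAISE";
var ON_ERROR_IGNORE = "IGNORE";

// Hypothetical stand-in for the Filter component. The predicate plays the role
// of the filter expression; auxiliaryRecords take part in the condition but are
// never filtered themselves.
function sketchFilter(predicate, inputRecords, auxiliaryRecords, onError) {
  var result = [];
  for (var i = 0; i < inputRecords.length; i++) {
    try {
      if (predicate(inputRecords[i], auxiliaryRecords)) {
        result.push(inputRecords[i]);
      }
    } catch (e) {
      if (onError !== ON_ERROR_IGNORE) { throw e; }
      // ON_ERROR_IGNORE: skip only the record that caused the error.
    }
  }
  return result;
}

var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }, { PRICE: 50 }];
var limits = [{ MAX_PRICE: 40 }];

function cheapEnough(record, aux) {
  if (record.PRICE === null) { throw new Error("PRICE is null"); }
  return record.PRICE <= aux[0].MAX_PRICE;
}

var kept = sketchFilter(cheapEnough, records, limits, ON_ERROR_IGNORE);
```

Here the record with a null PRICE raises an error and is left out, while the remaining records are still evaluated against the condition.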

5.3.14 Form Iterator

bull Object Form_Iterator

bull Description this allows a run loop to be generated for a specific form where predetermined values for each of the fields included are used in each run

bull Functions

o Constructor(findForm submitForm sequenceType reusableConnection baseElements inputPage parallelIterator)

bull findForm NSEQL program that allows for the form to be used as the basis of the iteration to be found (see [NSEQL] for further information on NSEQL)

bull submitForm NSEQL program that allows for the form to be invoked (see [NSEQL] for further information on NSEQL)

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

bull baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.

bull inputPage: input page from which the selected form can be iteratively invoked.

bull parallelIterator: if "true", the component executes its iterations in parallel.

o selectMultiplePositions(field position positionsArray clickedArray) indicates what positions are selected in a multiple selection field in the target form

bull field name of the multiple selection field

bull position position related to the field between those of the same name starting with position 0

bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values

bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this

bull CLICKED_ELEMENT mark the element

bull NON_CLICKED_ELEMENT leave the element as unmarked

bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked

o selectMultipleTexts(field position valuesArray positionsArray equalsArray clickedArray) this indicates the values selected from a multiple selection field for the chosen form

bull field name of the multiple selection field

bull position position related to the field between those of the same name starting with position 0

bull valuesArray list of values that must be selected in the field

bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values

bull equalsArray list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false)

bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this

bull CLICKED_ELEMENT mark the element

bull NON_CLICKED_ELEMENT leave the element as unmarked

bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked

o selectPositions(field position positions) this indicates the values selected from a selection field for the chosen form

bull field name of the HTML selection field

bull position position occupied in the event of more than one field element with the same name

bull positions values of the elements on which the component must iterate

o selectTexts(field position values positions equal) this indicates the values to be used in the different iterations on a text field

bull field name of the HTML text field

bull position: position of the field in the event of several on the form with the same name.

bull values: list of values that must be selected in the field.

bull positions: list that indicates the position held for each values element in the event of replicated values.

bull equals: Boolean value that indicates whether the values provided to the function must be identical to the field values (equals = true) or may merely be contained in them (equals = false).

o click(field, value, state): selects an element and runs a "click" event on it.

bull field: name of the HTML field on which the click is to be made.

bull value: when this function is run on Radio Buttons, this parameter indicates the elements selected, as a list (e.g., [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.

bull state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:

bull CLICKED_ELEMENT mark the element

bull NON_CLICKED_ELEMENT leave the element as unmarked

bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked

o input(field position values) function that indicates the values added to an input field

bull field name of the HTML input field

bull position position of the field in the event of several on the form with the same name

bull values list of values that must be selected in the field

o textarea(field position values) this indicates the values added to a text area

bull field name of the HTML input field

bull position position of the field in the event of several on the form with the same name

bull values list of values that must be selected in the field

o toList() returns the list with the NSEQL sequences used in each iteration

o setMaxIterations(count) sets the maximum number of iterations that can be executed

bull count number that determines the maximum number of iterations

o setRetries(count) update method for the number of retries in the event of failures

bull count number of retries

o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated

bull mseconds this indicates the waiting time between retries in milliseconds

o setParallelIterator(flag) the component launches the iteration in parallel

bull flag: if "true", the iterations are executed in parallel.

o next(inputPage) this returns the page resulting from running a component iteration

bull inputPage optional parameter that allows for a new starting page to be indicated on which a new component iteration is run

o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.

o close() function that closes the iterator

o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.

bull flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.

bull back: NSEQL back program.

o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

bull reusingConnection: if "true", the connection from previous components is reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session.

o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor POST synchronization has been configured.

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 Denodo HTTP browser implementation
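
The way per-field values and click states translate into iterations can be pictured with a minimal sketch. Everything below is a hypothetical stand-in defined for this example: the constant values, statesFor, buildCombinations and SketchFormIterator are not ITPilot code. The sketch assumes that the iterations are the cartesian product of the values given for each field, and its next() returns the chosen values, whereas the real component returns the resulting page.

```javascript
// Stand-in constants: hypothetical values so the sketch runs outside ITPilot.
var CLICKED_ELEMENT = "CLICKED";
var NON_CLICKED_ELEMENT = "NON_CLICKED";
var CLICKED_AND_NON_CLICKED_ELEMENT = "BOTH";

// Translates a click state into the list of checkbox states to try.
function statesFor(state) {
  if (state === CLICKED_AND_NON_CLICKED_ELEMENT) { return [true, false]; }
  return [state === CLICKED_ELEMENT];
}

// Assumption: one iteration per element of the cartesian product of field values.
function buildCombinations(fields) {
  var combos = [{}];
  for (var f = 0; f < fields.length; f++) {
    var expanded = [];
    for (var c = 0; c < combos.length; c++) {
      for (var v = 0; v < fields[f].values.length; v++) {
        var copy = {};
        for (var key in combos[c]) {
          if (combos[c].hasOwnProperty(key)) { copy[key] = combos[c][key]; }
        }
        copy[fields[f].name] = fields[f].values[v];
        expanded.push(copy);
      }
    }
    combos = expanded;
  }
  return combos;
}

function SketchFormIterator(fields) {
  this.combos = buildCombinations(fields);
  this.index = 0;
}
SketchFormIterator.prototype.setMaxIterations = function (count) {
  this.combos = this.combos.slice(0, count);
};
SketchFormIterator.prototype.hasNext = function () {
  return this.index < this.combos.length;
};
SketchFormIterator.prototype.next = function () {
  return this.combos[this.index++];
};

var formIterator = new SketchFormIterator([
  { name: "city", values: ["Madrid", "Paris"] },
  { name: "onlyOffers", values: statesFor(CLICKED_AND_NON_CLICKED_ELEMENT) }
]);
var submitted = [];
while (formIterator.hasNext()) {
  submitted.push(formIterator.next());
}
```

Two input values combined with a checkbox in the CLICKED_AND_NON_CLICKED_ELEMENT state give four iterations in this sketch; calling setMaxIterations before the loop would cap that number.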

5.3.15 Get Page

bull Object Get_Page

bull Description obtains an active browser from the browser pool from a previously retrieved identification code

bull Functions

o Constructor(browserUuid) obtains (or optionally creates) the handler to an active browser from its identification

bull browserUuid browser id

o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27).

bull pageType type of browser used to access the page

bull SEQUENCE_IEBROWSER = 1

bull SEQUENCE_HTTP_BROWSER = 2

bull lastURL last URL where the page is coming from

bull lastURLMethod access method (GET POST) of the URL the page is coming from

bull lastURLPostParameters POST-method parameters of the URL the page is coming from

bull cookie: information stored in "cookies".

bull proxyUser user name to access the Proxy if required

bull proxyPassword user password to access the Proxy if required

bull proxyDomain Proxy domain if required

5.3.16 Init

bull Object Init

bull Description is responsible for storing the structure of the input data which is the data that the wrapper will receive from the calling application

bull Functions

o Constructor(input output)

bull input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

bull output name of the output record of the component which represents the query parameters of the wrapper Its use is optional in the standard process main function if not specified the record will be generated at runtime (with the exec() function)

o get(name) this returns the value of a record field created as a group of initialization parameters

bull name name of the record field

o setText(field obl fixedValue) this creates a text-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setInt(field obl fixedValue) this creates an integer-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setLong(field obl fixedValue) this creates a long-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setFloat(field obl fixedValue) this creates a floating-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setDouble(field obl fixedValue) this creates a double-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setBlob(field obl fixedValue) this creates a BLOB-type (binary large object) field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setBoolean(field obl fixedValue) this creates a Boolean-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setLink(field obl fixedValue) this creates a URL-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setDate(field format obl fixedValue) this creates a date-type field in the initialization record

bull field name of the field to create

bull format: representation format of the date field. Specifying a format is optional, but once one is specified it is compulsory to comply with it; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setName(name) update function for the component name

bull name new component name

o setI18n(i18n) function which updates the process i18n

bull i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

o exec() main function for running the component returning a record representing the wrapper initialization parameters
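
The meaning of the OPTIONAL, OBLIGATORY and FIXED values of the obl parameter can be illustrated with a minimal sketch. SketchInit and the constant values are hypothetical stand-ins written for this example, and only setText is modelled. Unlike the real exec(), which takes no arguments and reads the query from the JavaScript context, the stand-in receives the query parameters explicitly; raising an error for a missing obligatory parameter and returning null for an unset optional one are assumptions.

```javascript
// Stand-in constants: hypothetical values so the sketch runs outside ITPilot.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical stand-in for the Init component.
function SketchInit() {
  this.fields = [];
}
SketchInit.prototype.setText = function (field, obl, fixedValue) {
  // OPTIONAL is the documented default when obl is not given.
  this.fields.push({ name: field, obl: obl || OPTIONAL, fixedValue: fixedValue });
};
// Assumption: query parameters are passed in explicitly for the sketch.
SketchInit.prototype.exec = function (queryParams) {
  var record = {};
  for (var i = 0; i < this.fields.length; i++) {
    var f = this.fields[i];
    if (f.obl === FIXED) {
      record[f.name] = f.fixedValue;          // constant value, query is ignored
    } else if (queryParams.hasOwnProperty(f.name)) {
      record[f.name] = queryParams[f.name];   // value supplied by the query
    } else if (f.obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + f.name);
    } else {
      record[f.name] = null;                  // optional and not supplied
    }
  }
  return record;
};

var init = new SketchInit();
init.setText("KEYWORD", OBLIGATORY);
init.setText("LANGUAGE", OPTIONAL);
init.setText("SITE", FIXED, "example.com");
var params = init.exec({ KEYWORD: "denodo" });
```

In this sketch params holds the supplied KEYWORD, a null LANGUAGE and the constant SITE value; a query without KEYWORD would be rejected.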

5.3.17 Iterator

bull Object Iterator

bull Description component that iterates on a list of records one by one

bull Functions

o Constructor(list)

bull list list of records on which to iterate

o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.

o next() this returns the next iteration element The list is a sorted sequence of records

The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29.

var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}

Figure 5 Using threads in the Iterator component
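
For comparison, the sequential use of hasNext() and next() can be sketched with a stand-in iterator. SketchIterator and the TITLE field are hypothetical names defined for this example so that it runs outside ITPilot; only the hasNext/next contract described in this section is modelled.

```javascript
// Hypothetical stand-in for the Iterator component: it walks a list of
// records in order, one record per call to next().
function SketchIterator(list) {
  this.list = list;
  this.pos = 0;
}
SketchIterator.prototype.hasNext = function () {
  return this.pos < this.list.length;
};
SketchIterator.prototype.next = function () {
  return this.list[this.pos++];
};

var iterator = new SketchIterator([{ TITLE: "A" }, { TITLE: "B" }, { TITLE: "C" }]);
var titles = [];
while (iterator.hasNext()) {
  var recordInstance = iterator.next();
  titles.push(recordInstance.TITLE);  // per-record processing goes here
}
```

Because the list is a sorted sequence, the records are visited in the order in which they were stored.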

5.3.18 JDBCExtractor

bull Object JDBCExtractor

bull Description These functions allow sending a query to any source available via JDBC and return a record list with the obtained results

bull Functions

o Constructor (uuid uri driver userName password structure baseRecords maxPoolSize initialPoolSize checkQuery query)

bull uuid component unique identifier

bull uri connection URL to the database

bull driver driver class to use to connect to the data source

bull userName user name

bull password user password

bull structure: structure of the component's output record list. It is defined as a record of values.

bull baseRecords: record list to be used.

bull maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.

bull initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

bull checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.

bull query SQL query that returns the results required by the component

o exec(query baseRecords) executes the JDBCExtractor component

bull query SQL query that returns the results required by the component

bull baseRecords record list to be used

o setPoolConfig(maxPoolSize initialPoolSize pingQuery) updates the pool configuration

bull maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.

bull initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

bull pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.

o disablePool() disables the connection pool

o addDriverProperty(propname propvalue) adds a JDBC driver property

bull propname property name

bull propvalue property value

5.3.19 Loop

bull Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The loop component is implemented in JavaScript using a while loop with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    ...
}

Figure 6 Using the Loop function
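
A runnable approximation of the structure in Figure 6 is given below. SketchCondition is a hypothetical stand-in for the Condition object: it wraps a JavaScript function instead of an ITPilot expression string, which is an assumption made so that the example runs outside ITPilot. The counter pagesVisited is likewise illustrative.

```javascript
// Hypothetical stand-in for the Condition object: exec returns true while the
// wrapped test holds, which keeps the loop running.
function SketchCondition(test) {
  this.test = test;
}
SketchCondition.prototype.exec = function (inputs) {
  return this.test(inputs) === true;
};

var pagesVisited = 0;
var loop = new SketchCondition(function () {
  return pagesVisited < 3;  // condition re-evaluated before every pass
});

while (loop.exec([])) {
  pagesVisited++;           // loop operations go here
}
```

The condition is checked before each pass, so the body does not run at all if the condition is false from the start.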

5.3.20 Next Interval Iterator

bull Object Next_Interval_Iterator

bull Description: allows iterating over different interrelated pages by means of one or several browsing sequences.

bull Functions

o Constructor(sequences iterations sequenceType reuse inputPage)

bull sequences list of browsing sequences to use If there is only one sequence it will try to use it in all iterations If there is more than one sequence it will use one in each iteration

bull iterations this indicates for every sequence the number of iterations to be made the size of this list must be equal to the size of the list provided in the sequences parameter This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session's information.

bull inputPage this indicates the page from which the next browsing sequence is to be made

o next(inputRecords inputPage) this returns the next iteration element

bull inputRecords list of input records that can be used as parameters within the browsing sequences at the next interval

bull inputPage this indicates the page from which the next pages are to be accessed

o close() this closes the iterator

o setRetries(count) this configures the number of retries in the event of error in accessing the next page

bull count number of retries

o setRetryDelay(count) this configures the interval between two retries

bull count interval in milliseconds

o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.

bull flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.

bull back: NSEQL back program.

o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

bull reusingConnection: if "true", the connection from previous components is reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session.

o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor POST synchronization has been configured.

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 HTTP browser implementation

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: the component input record to be used as the wrapper result.
  o add(record): adds, at a later point, the component input record to be used as the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements except for the one that caused the error.
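
The call order described above (construct, add fields, exec) can be sketched as follows. This is only an illustration: Record_Constructor and the ON_ERROR_* names are normally supplied by the ITPilot runtime, so the snippet defines small stand-ins for them in order to run under a plain JavaScript engine. The stand-in understands only field references written as "$<index>.<FIELD>", which is an assumption about the expression syntax, and the record contents are invented.

```javascript
// Stand-ins for names that the ITPilot runtime normally provides (assumption).
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Stub with the documented shape: Constructor(recordsObj, name), add(), exec().
// It only evaluates expressions of the form "$<index>.<FIELD>".
function Record_Constructor(recordsObj, name) {
    this.recordsObj = recordsObj;
    this.name = name;
    this.fields = [];
}
Record_Constructor.prototype.add = function (fieldName, expression, errorAction) {
    this.fields.push({ fieldName: fieldName, expression: expression, errorAction: errorAction });
};
Record_Constructor.prototype.exec = function () {
    var result = {};
    for (var i = 0; i < this.fields.length; i++) {
        var field = this.fields[i];
        var match = /^\$(\d+)\.(\w+)$/.exec(field.expression);
        var source = match ? this.recordsObj[Number(match[1])] : undefined;
        if (source && match[2] in source) {
            result[field.fieldName] = source[match[2]];
        } else if (field.errorAction === ON_ERROR_RAISE) {
            throw new Error("Cannot evaluate " + field.expression + " for " + field.fieldName);
        }
        // With ON_ERROR_IGNORE the field is skipped and the run continues.
    }
    return result;
};

// Hypothetical input records.
var movie = { TITLE: "Manhattan", DIRECTOR: "Woody Allen" };
var edition = { FORMAT: "DVD", PRICE: 9.99 };

var rc = new Record_Constructor([movie, edition], "MOVIE_EDITION");
rc.add("TITLE", "$0.TITLE", ON_ERROR_RAISE);    // field of the first input record
rc.add("FORMAT", "$1.FORMAT", ON_ERROR_RAISE);  // field of the second input record
rc.add("RATING", "$1.RATING", ON_ERROR_IGNORE); // missing field, error ignored
var merged = rc.exec();
console.log(JSON.stringify(merged));
```

Choosing ON_ERROR_IGNORE for optional fields lets the wrapper continue when a source record lacks them, while ON_ERROR_RAISE is the safer choice for fields the rest of the flow depends on.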

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information. In general this value will be "true", although in some cases that may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; allows a start page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
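
The pattern of Figure 7 can be tried outside ITPilot with the sketch below. Condition, RUNTIME_ERROR and ON_ERROR_RAISE are stand-ins defined here only so that the snippet runs; in particular, the stand-in takes a plain JavaScript predicate where the real component takes an ITPilot condition expression, and the page counter is an invented example. As written in Figure 7, the loop body runs at least once and keeps running for as long as exec() returns true.

```javascript
// Stand-ins for the ITPilot runtime names used in Figure 7 (assumption).
var RUNTIME_ERROR = "RUNTIME_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

// Stub Condition: a JavaScript predicate plays the role of the ITPilot
// condition expression.
function Condition(predicate) {
    this.predicate = predicate;
    this.handlers = {};
}
Condition.prototype.onError = function (errorType, action) {
    this.handlers[errorType] = action;
};
Condition.prototype.exec = function (inputs) {
    return Boolean(this.predicate(inputs));
};

var pagesVisited = 0;
var maxPages = 3; // hypothetical limit

var repeat = new Condition(function () { return pagesVisited < maxPages; });
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);

do {
    pagesVisited++; // the loop operations go here
} while (repeat.exec([]));

console.log("Loop body ran " + pagesVisited + " times");
```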

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component holds within the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter; indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to restore the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
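
A typical call order for the methods listed above might look like the sketch below: construct the sequence, configure retries, run it, and close the browser connection once the page has been processed. Sequence and SEQUENCE_IEBROWSER are provided by ITPilot at run time; the stand-ins defined here only record their arguments so that the snippet runs on its own. The NSEQL program, the URL and the input value are illustrative placeholders.

```javascript
// Stand-in for a constant normally supplied by the ITPilot runtime (assumption).
var SEQUENCE_IEBROWSER = "SEQUENCE_IEBROWSER";

// Stub with the documented method names. It records its configuration and
// returns a fake page object instead of driving a browser.
function Sequence(sequence, sequenceType, reusableConnection, inputPage) {
    this.sequence = sequence;
    this.sequenceType = sequenceType;
    this.reusableConnection = reusableConnection;
    this.inputPage = inputPage;
    this.retries = 0;
    this.retryDelay = 0;
    this.closed = false;
}
Sequence.prototype.setRetries = function (count) { this.retries = count; };
Sequence.prototype.setRetryDelay = function (mseconds) { this.retryDelay = mseconds; };
Sequence.prototype.exec = function (inputValues, inputPage) {
    return { reachedBy: this.sequence, inputValues: inputValues, from: inputPage || this.inputPage };
};
Sequence.prototype.close = function () { this.closed = true; };
Sequence.prototype.toString = function () { return this.sequence; };

// Illustrative NSEQL program; the URL is a placeholder.
var seq = new Sequence("Navigate(http://www.example.com,0);", SEQUENCE_IEBROWSER, true);
seq.setRetries(3);       // number of retries on failure
seq.setRetryDelay(2000); // wait 2000 ms between retries

var page = null;
try {
    page = seq.exec(["Woody Allen"]);
    // ... data extraction on the returned page would go here ...
} finally {
    seq.close(); // release the browser connection even if exec() fails
}
console.log(seq.toString() + " closed=" + seq.closed);
```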

5.3.28 Store File

• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
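
The fan-out pattern described above (one execute call per record, then a single wait) can be sketched as follows. The Thread object here is a stand-in written for this example: it has no documented constructor in this section, so `new Thread()` is an assumption, and it runs the queued calls one after another when wait() is called rather than concurrently. The `wrapperFunctions` lookup table is a device of the stub only; in ITPilot the name passed to execute refers to a function defined in the wrapper script.

```javascript
// Functions the stub can look up by name (stub-only device, see lead-in).
var wrapperFunctions = {};

// Stand-in for the ITPilot Thread object. Calls are queued by execute() and
// run sequentially in wait(), so this shows the call pattern, not concurrency.
function Thread() {
    this.maxConcurrent = 1;
    this.pending = [];
}
Thread.prototype.setMaxConcurrentThreads = function (max) {
    this.maxConcurrent = max;
};
Thread.prototype.execute = function (functionName) {
    var args = Array.prototype.slice.call(arguments, 1);
    this.pending.push(function () {
        wrapperFunctions[functionName].apply(null, args);
    });
};
Thread.prototype.wait = function () {
    while (this.pending.length > 0) {
        this.pending.shift()();
    }
};

// Hypothetical per-record processing function.
var processed = [];
wrapperFunctions.processRecord = function (record, index) {
    processed.push(index + ":" + record.TITLE);
};

var records = [{ TITLE: "Manhattan" }, { TITLE: "Zelig" }, { TITLE: "Annie Hall" }];
var thread = new Thread();
thread.setMaxConcurrentThreads(2); // later requests would be queued
for (var i = 0; i < records.length; i++) {
    thread.execute("processRecord", records[i], i);
}
thread.wait(); // returns once every execute() call has finished
console.log(processed.join(", "));
```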

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }

  o This function defines the output structure returned by the component. It is needed only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
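
As a minimal sketch of the naming scheme above, the following file body defines a custom component called "upper" (a name chosen for this example) that returns its input in upper case. It assumes that the input arrives as a single string value, which this section does not specify. STRING_TYPE is declared locally with the value listed above so that the snippet runs on its own. The getInputStructure and getOutputStructure functions are left out because the calls needed to build a structure are not shown in this section, and a simple string output does not need an output structure.

```javascript
// Output type code taken from the list above, declared here so the sketch is
// self-contained.
var STRING_TYPE = 13;

// Main function of the hypothetical "upper" component.
// Assumption: upper_input is a single string value.
function upper_main(upper_input) {
    var upper_output = null;
    if (upper_input !== null && upper_input !== undefined) {
        upper_output = String(upper_input).toUpperCase();
    }
    return upper_output;
}

// Declares the component output type: a simple string.
function upper_getOutputType() {
    return STRING_TYPE;
}

console.log(upper_main("manhattan"), upper_getOutputType());
```

Following the rule stated in section 5.4.2, such a component would be saved in a file named after the component (here upper.js) in the custom components directory.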

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is simple: the following VQL statement has to be written:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

3.2 OBTAINING WRAPPERS

As mentioned in the preceding section, connecting to the execution server consists of creating an instance of the class com.denodo.itpilot.client.HTMLWrapperServerProxy. This class provides methods for obtaining data on the execution server and for accessing the wrappers present in it:

• Collection getHTMLWrapperNames(). Obtains a collection with the names of the wrappers present in the execution server. Note that if Virtual DataPort is being used as the execution server, the connection will have been made to a Virtual DataPort database, and only the wrappers associated with that database will be obtained.

• HTMLWrapperProxy getHTMLWrapper(String wpName). Obtains a reference to the wrapper with the name specified as parameter.

• Collection getDatabaseNames(). This method can only be invoked by users with administration rights in Virtual DataPort. It returns a collection with the names of the databases that exist in the server.

• void deleteWrapper(String wpName). Deletes from the server the wrapper whose name is specified as parameter.

• void loadWrapper(String vql). Takes as input argument the VQL that defines a collection of wrappers, which are loaded into the execution server.

• String getVQL(). Returns the VQL description of all the wrappers in the ITPilot execution server.

3.3 USING WRAPPERS

Once a reference to a wrapper has been obtained (an instance of the class HTMLWrapperProxy), various operations can be carried out on it through the methods of that class. To execute a query on a wrapper, use the method:

HTMLWrapperResultIterator query(Map params)

The query to be executed is represented as a map of attribute name/value pairs. The attribute names must match the names of the input parameters specified when the wrapper was created. The values must be specified as character strings, even when the input parameters expected by the wrapper are of another type. For example, if a wrapper expects a float-type parameter and we want to assign it the value 325, we must pass the string "325". For the float, double and date data types, it is important to make sure that the values are provided according to the internationalization configuration specified in the wrapper Init component or, in the case of date data types, according to the date pattern if one was set. Note also that, for the query to execute correctly, a value must be specified for all the mandatory attributes. See [GENER] for more information on the process of generating wrappers in ITPilot.

Although most applications will not require it, the schema of a wrapper can be obtained using the method:

HTMLWrapperMetaRegisterRawVO getSchema()

This method returns the schema of the results returned by the wrapper and the characteristics of the atomic fields that form part of that schema. The schema was defined during the generation of the wrapper (see [GENER]).

The results returned by a wrapper follow a hierarchical structure. Each output tuple contains a value for every attribute contained in the wrapper response. Each attribute may be either atomic or compound. The value of an atomic attribute can be of any of the basic data types available in ITPilot: int, long, float, double, text, date, Boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and again these fields may be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result is comprised of the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the various atomic fields that comprise the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained for an atomic field:

• its type, using the method java.lang.Class getType();
• whether it is a searchable field whose value is not obtained from the source (and which therefore cannot be found in the output schema), using the method boolean isSearchStatus();
• in that case, whether it is mandatory or not, using the method boolean isMandatoryStatus();
• if they were defined during the generation process, the regular expression (method java.lang.String getRegexp()) and the aliases defined for the field (method java.util.List getTextValues()).

Finally, the methods

void setMaintenance(boolean value)
void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting, through the API, whether or not a wrapper should be automatically maintained by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be installed automatically in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.

3.4 PROCESSING QUERY RESULTS

The query method for executing queries on a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (it is important to remember that the wrapper obtains the data from the source in real time, through the network).

The method hasNext() checks whether there are still elements to return. Because results arrive asynchronously, this method must be called before accessing each element, to make sure that data elements are available.

The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field. The method next() throws an exception of type NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return; hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:

• Boolean checkErrors(). Checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.

• String getErrorDescription(). If an error has occurred, returns a textual description of it. Otherwise, it returns null. The custom error messages specified by the wrapper creator for the 'raise error' handler in the Wrapper Generator Tool (see [GENER]) are accessed through this method.

3.4.1 Canceling Queries

The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

void cancel()

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the 'acme' machine, on port 9999. Next, a reference is obtained to the wrapper called "Movies", whose schema is the one used as an example in the preceding section:

TITLE, DIRECTOR, EDITIONS {FORMAT, PRICE, DESCRIPTION}

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper, using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text except PRICE, which is a double. Finally, any errors produced during execution are checked.

    package com.denodo.itpilot.client;

    import java.util.List;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Iterator;

    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

    public class ITPilotExample {

        public static void main(String args[]) {
            try {
                // Connect to server (host and port; see section 3.1 for the
                // constructor arguments)
                HTMLWrapperServerProxy server =
                        new HTMLWrapperServerProxy("acme", 9999);

                // Get Wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ". ");

                    // Get and print atomic fields: TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);

                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);
                        RegisterVO editionVO = (RegisterVO) editions.next();

                        SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);

                        // PRICE is the only non-text field (double)
                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);

                        SimpleVO descriptionVO =
                                (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\t DESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server");
            } finally {
            }
        }
    }

Figure 1 Example of query execution to a wrapper

4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes included in a Jar file that are added to ITPilot so they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in which provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but it is possible to group them in a single Jar. Custom functions can be created without dependencies on Denodo libraries by using naming conventions, but we recommend developing them with Java annotations. The annotations, compound types and values required to create custom functions are located in:

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together by uploading/removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, the class must define a Java method implementing the function for those arguments, and one additional method in case the signature returns a different type depending on the parameters or the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be Java primitive types (int, long, double, etc.).
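The table above can be restated as a small lookup, which may help when checking a signature by hand. This is only an illustrative sketch: the class ItpTypeMapping and its method itpilotTypeOf are hypothetical helpers, not part of any Denodo library. Only the mappings themselves come from the table.

```java
import java.util.Calendar;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Denodo's libraries): restates the
// Java-to-ITPilot type table as a lookup.
final class ItpTypeMapping {

    private static final Map<Class<?>, String> TYPES = new LinkedHashMap<Class<?>, String>();

    static {
        TYPES.put(Integer.class, "int");
        TYPES.put(Long.class, "long");
        TYPES.put(Float.class, "float");
        TYPES.put(Double.class, "double");
        TYPES.put(Boolean.class, "boolean");
        TYPES.put(String.class, "text");
        TYPES.put(Calendar.class, "date");
        TYPES.put(byte[].class, "binary");
    }

    private ItpTypeMapping() {
    }

    // Returns the ITPilot type name for a Java parameter type, or null when
    // the table has no entry (for example for primitives such as int.class).
    static String itpilotTypeOf(Class<?> javaType) {
        return TYPES.get(javaType);
    }

    public static void main(String[] args) {
        System.out.println(itpilotTypeOf(String.class)); // text
        System.out.println(itpilotTypeOf(int.class));    // null
    }
}
```

The null result for int.class reflects the note above: a primitive parameter has no entry in the table, whereas java.lang.Integer does.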

4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.

To make a Java class recognizable as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must have the name execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. For example, the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or that return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining this method is optional: if it is not present, the execute method will be called and the return type will be obtained from the results of the execution. The advantage of defining executeReturnType is that in some cases calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement. Class annotation used to define the class as a custom function. The annotation requires the parameters:
    o name: name of the custom function.
    o type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor. Method annotation used to specify the method as a function signature. This method will be executed when using the function with the appropriate arguments. The annotation has an optional variable, syntax, used to specify the syntax of the function signature when presenting it to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType. Method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.

• com.denodo.common.custom.annotations.CustomParam. Parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
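For comparison with the annotation-based approach, the naming conventions described at the start of this section can be sketched in plain Java with no Denodo dependencies. The class below is illustrative only. The concatenation logic is an assumption made for the example, and so is the declared return type Class<?> of executeReturnType (the guide only states that the method returns java.lang.String.class when execute returns a String). A real function class would normally be declared public in its own source file.

```java
// Sketch of a custom function defined purely through naming conventions.
// By those conventions, the class name Concat_SampleItpFunction would expose
// a function named Concat_Sample.
class Concat_SampleItpFunction {

    // Variable-arity signature: only the last parameter may repeat, so it is
    // declared as a Java varargs array. The concatenation is illustrative.
    public String execute(String... inputs) {
        if (inputs == null) {
            return null;
        }
        StringBuilder result = new StringBuilder();
        for (String input : inputs) {
            if (input != null) {
                result.append(input);
            }
        }
        return result.toString();
    }

    // Optional companion method with an equivalent signature. It reports the
    // return type without running execute. Declaring the result as Class<?>
    // is an assumption of this sketch.
    public Class<?> executeReturnType(String... inputs) {
        return String.class;
    }
}
```

In this sketch, new Concat_SampleItpFunction().execute("a", "b", "c") returns "abc", while executeReturnType answers with String.class without touching the inputs.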

4.2 COMPOUND TYPES

Compound types and values in the custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType. Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types, or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue. Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType. Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue. Class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil. Helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object in their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on the input values, and functions returning compound types, can implement an additional method to compute the return type without executing the function. Except in the case covered by rule 1 below, this is optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execution method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same or an equivalent type as its respective parameter in the execute method.

The value returned by the additional method depends on what the execute method returns:

• If the execute method returns a basic Java type, the additional method has to return the same basic Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table "Equivalency between Java and ITPilot data types" at the beginning of section 4 to know the type that these return values will have in ITPilot.

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    @CustomExecutor()
    public CustomArrayValue split_sample(@CustomParam(name="regexp") String regex,
            @CustomParam(name="value") String value) {
        if (value == null || regex == null) {
            return null;
        }
        String[] result = value.split(regex);
        List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
        for (String string : result) {
            // Build one single-field record per substring
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        // The array elements are records with a single text field
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}

Figure 2 ITPilot Custom Function Sample
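To see the shape of the value that SPLIT_SAMPLE builds without a Denodo installation, the same logic can be sketched with plain Java collections. This is a stand-in resting on stated assumptions, not the Denodo API: a Map plays the role of CustomRecordValue, a List plays the role of CustomArrayValue, and the class name SplitShapeDemo is hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java stand-in for the SPLIT_SAMPLE logic. Map and List replace the
// Denodo CustomRecordValue and CustomArrayValue classes for illustration only.
final class SplitShapeDemo {

    private static final String STRING_FIELD = "string";

    private SplitShapeDemo() {
    }

    // Returns one single-field record per substring, or null when either
    // argument is null, mirroring the null check of the sample.
    static List<Map<String, Object>> split(String regex, String value) {
        if (value == null || regex == null) {
            return null;
        }
        String[] parts = value.split(regex);
        List<Map<String, Object>> records = new ArrayList<Map<String, Object>>(parts.length);
        for (String part : parts) {
            Map<String, Object> record = new LinkedHashMap<String, Object>(1);
            record.put(STRING_FIELD, part);
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        // Prints [{string=DVD}, {string=Blu-ray}, {string=VHS}]
        System.out.println(split("\\s*,\\s*", "DVD, Blu-ray ,VHS"));
    }
}
```

Each element of the result is a record with the single field "string", which is the structure that the return-type method of the sample declares.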

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some basic prior knowledge of the JavaScript language.

Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3:

function getInit() {
    var start = new Init();
    start.setText("INITPARAM", "", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}

Figure 3 ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one. It contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist¹.

These functions correspond to the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

¹ Since version 4.0SP1 this function, previously known as getMetadata, has been renamed to getOutputSchema. There is backwards compatibility, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

The getInit() function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on in section 5.3 (the Component Catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The published functions of every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure

The getOutputSchema() function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL)
rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY)

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections define them.

5.3.2.1 Record Structure

• Object: Record_Structure

• Description: represents a data structure that allows the structure of a specific record to be defined. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).

• Functions:

    o Constructor(name)
        • name: name of the structure.

    o setText(field, regexp, type): creates a new character string field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression constraining the character string. By default, if no constraint exists, its value is "".
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setLink(field, type): creates a new Link-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setInt(field, type): creates a new Integer-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setBoolean(field, type): creates a new boolean-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setLong(field, type): creates a new Long-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setFloat(field, type): creates a new Float-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setDouble(field, type): creates a new Double-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setDate(field, regexp, format, type): creates a new Date-type field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression constraining the character string. By default, if no constraint exists, its value is "".
        • format (optional): date format following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setRegister(record, type): creates a new Record-type field in the record.
        • record: record name.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setArray(name, structure, type): creates a new Array-type field in the record.
        • name: name of the array.
        • structure: data structure that represents the record structure contained in the array.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR, as explained in section 5.3.22, must be used, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.
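As an illustration of the default date format mentioned for setDate, the following sketch formats a date with java.text.SimpleDateFormat. It assumes that [DATEFORMAT] refers to the SimpleDateFormat pattern syntax and that the default pattern is d-MMM-yyyy H'h' m'm' s's'. The class name DefaultDatePatternDemo, the fixed UTC time zone and the US locale are choices made only for this example.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class: shows what a date looks like under the assumed
// default pattern. It does not call any ITPilot API.
final class DefaultDatePatternDemo {

    // Assumed default pattern. The quoted letters are literal text.
    static final String PATTERN = "d-MMM-yyyy H'h' m'm' s's'";

    private DefaultDatePatternDemo() {
    }

    // Formats the date in the calendar's own time zone, using US month names.
    static String format(Calendar date) {
        SimpleDateFormat formatter = new SimpleDateFormat(PATTERN, Locale.US);
        formatter.setTimeZone(date.getTimeZone());
        return formatter.format(date.getTime());
    }

    public static void main(String[] args) {
        Calendar date = new GregorianCalendar(TimeZone.getTimeZone("UTC"), Locale.US);
        date.clear();
        date.set(2011, Calendar.AUGUST, 16, 14, 5, 9);
        System.out.println(format(date)); // 16-Aug-2011 14h 5m 9s
    }
}
```

Because H, m and s appear as single letters, hours, minutes and seconds are not zero-padded, and the quoted h, m and s are emitted as literal suffixes.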

5.3.2.2 Record List

• Object: List

• Functions:

    o setListName(listName): sets the name of the list.
        • listName: name of the list.

    o add(obj): adds an element to the list.
        • obj: element to add.

    o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these "common" functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId parameter values.

    o errorId: indicates the type of error for which the behavior is to be managed. The possible values are:
        • RUNTIME_ERROR: error while the component is being run.
        • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
        • HTTP_ERROR: error produced by an HTTP error.
        • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the http browser.
        • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).

    o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
        • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
        • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have any kind of return value will return "null" when there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", the result will be evaluated as "false" if they are used in a condition expression.
        • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configurable (Figure 4 shows setRetries and setRetryDelay being called on a component).
        • ON_ERROR_RETRY_IGNORE: rerun the wrapper as with ON_ERROR_RETRY, but continue with the wrapper execution if the error still occurs after the retries.

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level to be used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:

    o TRACE
    o DEBUG
    o INFO
    o WARN
    o ERROR
    o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List

• Description: adds a record to a list.

• Functions:

    o Constructor()

    o exec(record, list): executes the function.
        • record: record to be added to the list.
        • list: list to which the record is added.

5.3.5 Condition

• Object: Condition

• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.

• Functions:

    o Constructor(expr)
        • expr: this parameter defines the condition expression. It is expressed as a character string. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
        • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, …, ELEMENTN]", determines the elements on which the condition is evaluated.

5.3.6 Create List

• Object: Create_List

• Description: creates an empty list.

• Functions:

    o Constructor(listname): creates an empty list.
        • listname: name of the list of records to be created.

    o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser

• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.

• Functions:

    o Constructor(): creates a persistent browser and returns its handler.

    o exec(): executes the component.

5.3.8 Diff

• Object: Diff

• Description: the Diff component compares two pages and returns the differences between them with regard to the retrieved HTML code.

• Functions:

    o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
        • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
        • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
        • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
        • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
        • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.

    o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
        • baseCode: character string with the source page content.
        • finalCode: character string or page object with the target page content.

    o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
        • baseCode: character string with the source page content.
        • finalCode: character string or page object with the target page content.

    o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
        • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).

    o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
        • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).

    o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
        • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

    o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
        • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).

    o setNullWhenEquals(nullWhenEquals): if the result page is identical to any of the two input pages, the component will return "null" instead of the page itself.
        • nullWhenEquals: "true" implies that "null" will be returned when both pages are equal; "false" means that the result page will be returned.

    o setIgnoreTagAttributes(simplifyTags): the component will not take the HTML tag attributes into account when comparing both pages.
        • simplifyTags: "true" means that the HTML tag attributes will be ignored. With "false" they will not be ignored.

    o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
        • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.

    o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
        • mergedDeletions: with "true", the deleted content will be shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.

    o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to HTML tokens of the source pages before comparing them.
        • replacement: Perl [PERL] regular expression.

    o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to HTML tokens of the page. Tokens that match the regular expression will be discarded before starting the comparison.
        • regexp: Perl [PERL] regular expression.

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>);", SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4 Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.

• Functions:

    o Constructor(expression)
        • expression: object that defines the condition expression. This object is expressed as a character string. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
        • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor

• Description: responsible for extracting structured data from an HTML page by generating a DEXTL program ([DEXTL]).

• Functions:

    o Constructor(name, page, specification, structure)
        • name: name of the Extractor component instance.
        • page: page-type ITPilot structure from which data is to be extracted.
        • specification: DEXTL data extraction specification (see [DEXTL]).
        • structure: name of the record (previously created) that will be used to return the data extracted by the specification.

    o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.

    o setMergePatterns(merge): applies the technique of merging patterns for greater system optimization (see [GENER] for further information).
        • merge: Boolean parameter, "true" if the pattern merge technique is to be applied or "false" if not. It is "true" by default.

    o setI18n(i18n): updates the internationalization used by the process.
        • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 33

5.3.12 Fetch

• Object: Fetch

• Description: this obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

  o Constructor(url, sequenceType, reusableConnection, binary, page)

    • url: URL where the resource to be downloaded can be found (OPTIONAL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.

    • page: optionally, the page from which the HTTP request is launched can be indicated.

  o exec(page): this runs the component, returning the string- or binary-type value obtained.

    • page: optionally, the page from which the HTTP request is launched can be indicated.

  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.

    • encoding: MIME type of the information to send.

  o syncWithPost(flag): this function lets the user set the method for recovering the page state. With this method, ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.

    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.

  o setBackSequence(back): this function lets the user optionally set an explicit browsing sequence back to the source page, against which further data extraction operations will be executed.

    • back: back sequence NSEQL program.

  o setReusingConnection(reusingConnection): this function indicates whether connections will be reused or not.

    • reusingConnection: if the value is set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.

  o setBackPages(pages): this function determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.

  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
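
The order in which the page state recovery options of syncWithPost, setBackSequence and setBackPages are considered can be sketched in plain JavaScript. This is only an illustration of the documented fallback order, not ITPilot code: the function chooseRecovery, the shape of its config argument and the returned labels are hypothetical.

```javascript
// Hypothetical helper: mirrors the documented fallback order only.
// 1) POST synchronization, 2) explicit back sequence, 3) Back() command.
function chooseRecovery(config) {
  if (config.syncWithPost) {
    return { method: "POST" };
  }
  if (config.backSequence) {
    return { method: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // No POST synchronization and no back sequence: fall back to Back(),
  // going back as many pages as configured with setBackPages.
  return { method: "BACK_COMMAND", pages: config.backPages };
}

// The placeholder string stands for a real NSEQL back program.
console.log(chooseRecovery({ syncWithPost: false, backSequence: "<NSEQL back program>" }));
console.log(chooseRecovery({ syncWithPost: false, backPages: 2 }));
```

The sketch makes explicit that setBackPages only matters when the other two options are not in use.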

5.3.13 Filter

• Object: Filter

• Description: this carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

  o Constructor(expr, auxiliaryRecords)

    • expr: regular expression of the filtering operation for a list of records, which are described in the exec function.

    • auxiliaryRecords: record list that participates in the filter condition, but whose records are not the records to filter.

  o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subgroup that complies with the selection expression indicated in the constructor.

    • inputRecords: list of input records.

    • auxiliaryRecords: record list that participates in the filter condition, but whose records are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
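
The behaviour described in the note can be pictured with a small stand-alone sketch. It does not use the ITPilot Filter object: filterRecords, the predicate function and the locally defined error constants are hypothetical stand-ins, and the PRICE field is invented for the example.

```javascript
// Local stand-ins for the ITPilot error handling constants (assumption).
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical filter: `predicate` plays the role of the constructor expression.
function filterRecords(inputRecords, predicate, errorAction) {
  const result = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record)) {
        result.push(record);
      }
    } catch (err) {
      if (errorAction !== ON_ERROR_IGNORE) {
        throw err; // ON_ERROR_RAISE: stop and report the error
      }
      // ON_ERROR_IGNORE: skip only the record that caused the error
    }
  }
  return result;
}

const records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];
const isCheap = (r) => {
  if (r.PRICE === null) throw new Error("PRICE is missing");
  return r.PRICE <= 20;
};

const cheap = filterRecords(records, isCheap, ON_ERROR_IGNORE);
console.log(cheap.length); // 1
```

With ON_ERROR_RAISE the same call would stop at the second record instead of skipping it.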

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: this allows a run loop to be generated for a specific form, where predetermined values for each of the included fields are used in each run.

• Functions:

  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

    • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.

    • inputPage: input page from which the selected form can be iteratively invoked.

    • parallelIterator: if "true", the component will execute its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): this indicates the values selected from a multiple selection field for the chosen form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • valuesArray: list of values that must be selected in the field.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).

    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): this indicates the values selected from a selection field for the chosen form.

    • field: name of the HTML selection field.

    • position: position occupied by the field in the event of more than one field element with the same name.

    • positions: values of the elements on which the component must iterate.

  o selectTexts(field, position, values, positions, equal): this indicates the values to be used in the different iterations on a text field.

    • field: name of the HTML text field.

    • position: position of the field in the event of several fields on the form with the same name.

    • values: list of values that must be selected in the field.

    • positions: list that indicates the position held by each element of values in the event of replicated values.

    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.

  o click(field, value, state): function that allows an element to be selected and a "click" event to be run on it.

    • field: name of the HTML field on which the click is to be made.

    • value: when this function is run on Radio Buttons, this parameter indicates the elements selected as a list (e.g., [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.

    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o input(field, position, values): function that indicates the values added to an input field.

    • field: name of the HTML input field.

    • position: position of the field in the event of several fields on the form with the same name.

    • values: list of values that must be selected in the field.

  o textarea(field, position, values): this indicates the values added to a text area.

    • field: name of the HTML input field.

    • position: position of the field in the event of several fields on the form with the same name.

    • values: list of values that must be selected in the field.

  o toList(): returns the list with the NSEQL sequences used in each iteration.

  o setMaxIterations(count): sets the maximum number of iterations that can be executed.

    • count: number that determines the maximum number of iterations.

  o setRetries(count): update method for the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): this allows the waiting time between retries to be indicated.

    • mseconds: waiting time between retries, in milliseconds.

  o setParallelIterator(flag): this indicates whether the component launches its iterations in parallel.

    • flag: if "true", the iterations will be executed in parallel.

  o next(inputPage): this returns the page resulting from running a component iteration.

    • inputPage: optional parameter that allows a new starting page to be indicated, on which a new component iteration is run.

  o hasNext(): function that determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.

  o close(): function that closes the iterator.

  o syncWithPost(flag): this function indicates whether, in order to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.

    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that more data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.

    • reusingConnection: if "true", the connection from previous components will be reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
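
The effect of CLICKED_AND_NON_CLICKED_ELEMENT on the number of iterations can be illustrated with a short sketch. It assumes that the combinations of several elements are built as a Cartesian product, which the text above does not state explicitly; expandClickStates and the string values given to the constants are hypothetical and are not part of the ITPilot API.

```javascript
// Local stand-ins for the constants named above (their real values are not documented here).
const CLICKED_ELEMENT = "CLICKED_ELEMENT";
const NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
const CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Hypothetical helper: expands a clickedArray into every marked/unmarked combination.
// true = element marked, false = element unmarked.
function expandClickStates(clickedArray) {
  let combinations = [[]];
  for (const state of clickedArray) {
    const options =
      state === CLICKED_AND_NON_CLICKED_ELEMENT
        ? [true, false] // two combinations for this element
        : [state === CLICKED_ELEMENT];
    const expanded = [];
    for (const combination of combinations) {
      for (const option of options) {
        expanded.push(combination.concat(option));
      }
    }
    combinations = expanded;
  }
  return combinations;
}

const combos = expandClickStates([
  CLICKED_ELEMENT,
  CLICKED_AND_NON_CLICKED_ELEMENT,
  NON_CLICKED_ELEMENT,
]);
console.log(combos.length); // 2
```

Under this assumption each additional CLICKED_AND_NON_CLICKED_ELEMENT entry doubles the number of combinations, which is one reason to bound the run with setMaxIterations.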

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.

• Functions:

  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.

    • browserUuid: browser id.

  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27).

    • pageType: type of browser used to access the page:

      • SEQUENCE_IEBROWSER = 1

      • SEQUENCE_HTTP_BROWSER = 2

    • lastURL: last URL the page is coming from.

    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.

    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.

    • cookie: information storage "cookies".

    • proxyUser: user name to access the proxy, if required.

    • proxyPassword: user password to access the proxy, if required.

    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: this is responsible for storing the structure of the input data, which is the data that the wrapper will receive from the calling application.

• Functions:

  o Constructor(input, output)

    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context.

    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record will be generated at runtime (with the exec() function).

  o get(name): this returns the value of a record field created as part of the group of initialization parameters.

    • name: name of the record field.

  o setText(field, obl, fixedValue): this creates a text-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): this creates an integer-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): this creates a long-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): this creates a float-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): this creates a double-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): this creates a BLOB-type (binary large object) field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): this creates a Boolean-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): this creates a URL-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): this creates a date-type field in the initialization record.

    • field: name of the field to create.

    • format: representation format of the date field. This format is optional, but becomes compulsory once it has been specified; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): update function for the component name.

    • name: new component name.

  o setI18n(i18n): function that updates the internationalization (i18n) of the process.

    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

  o exec(): main function for running the component, returning a record that represents the wrapper initialization parameters.
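
How the OPTIONAL, OBLIGATORY and FIXED values of the obl parameter could affect the initialization record is sketched below in plain JavaScript. This is not the Init component: buildInitRecord, the field descriptor objects and the field names are hypothetical, and the sketch assumes that a FIXED field always takes its fixedValue regardless of what the query supplies.

```javascript
// Local stand-ins for the constants described above (assumption).
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// Hypothetical helper: builds the record of query parameters for one wrapper query.
function buildInitRecord(fields, queryParams) {
  const record = {};
  for (const field of fields) {
    if (field.obl === FIXED) {
      record[field.name] = field.fixedValue; // constant value
    } else if (field.name in queryParams) {
      record[field.name] = queryParams[field.name];
    } else if (field.obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + field.name);
    }
    // OPTIONAL fields without a value are simply left out.
  }
  return record;
}

const fields = [
  { name: "LOGIN", obl: OBLIGATORY },
  { name: "LANGUAGE", obl: FIXED, fixedValue: "en" },
  { name: "PAGE", obl: OPTIONAL },
];

console.log(buildInitRecord(fields, { LOGIN: "demo" }));
```

Calling buildInitRecord(fields, {}) throws, because LOGIN is declared as OBLIGATORY and no value is supplied.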

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records, one by one.

• Functions:

  o Constructor(list)

    • list: list of records on which to iterate.

  o hasNext(): this determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.

  o next(): this returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: this component sends a query to any source available via JDBC and returns a record list with the results obtained.

• Functions:

  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

    • uuid: unique identifier of the component.

    • uri: connection URL to the database.

    • driver: driver class used to connect to the data source.

    • userName: user name.

    • password: user password.

    • structure: structure of the component's output record list. It is defined as a record of values.

    • baseRecords: record list to be used.

    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.

    • query: SQL query that returns the results required by the component.

  o exec(query, baseRecords): executes the JDBCExtractor component.

    • query: SQL query that returns the results required by the component.

    • baseRecords: record list to be used.

  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.

  o disablePool(): disables the connection pool.

  o addDriverProperty(propname, propvalue): adds a JDBC driver property.

    • propname: property name.

    • propvalue: property value.

5.3.19 Loop

• Description: this allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: this allows iteration over different inter-related pages by means of one or several browsing sequences.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

    • sequences: list of browsing sequences to use. If there is only one sequence, the component will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.

    • iterations: this indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used to date is reused, or whether a new browser is launched that maintains the session's information.

    • inputPage: this indicates the page from which the next browsing sequence is to be run.

  o next(inputRecords, inputPage): this returns the next iteration element.

    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.

    • inputPage: this indicates the page from which the next pages are to be accessed.

  o close(): this closes the iterator.

  o setRetries(count): this configures the number of retries in the event of an error when accessing the next page.

    • count: number of retries.

  o setRetryDelay(count): this configures the interval between two retries.

    • count: interval in milliseconds.

  o syncWithPost(flag): this function indicates whether, in order to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.

    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that more data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.

    • reusingConnection: if "true", the connection from previous components will be reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: HTTP browser implementation
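
The rule given for the sequences parameter of the constructor (a single sequence is reused in every iteration, whereas several sequences are used one per iteration) can be sketched as follows. The function sequenceForIteration is hypothetical, the placeholder strings stand for real NSEQL programs, and the sketch deliberately ignores the iterations parameter.

```javascript
// Hypothetical helper: picks the browsing sequence for a given iteration (0-based).
function sequenceForIteration(sequences, iteration) {
  if (sequences.length === 1) {
    return sequences[0]; // the only sequence is tried in all iterations
  }
  // Several sequences: one per iteration, none once they are exhausted.
  return iteration < sequences.length ? sequences[iteration] : null;
}

const single = ["<NSEQL: go to next page>"];
const several = ["<NSEQL: go to page 2>", "<NSEQL: go to page 3>"];

console.log(sequenceForIteration(single, 5));
console.log(sequenceForIteration(several, 1));
console.log(sequenceForIteration(several, 2)); // null
```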

5.3.21 Output

• Object: Output

• Description: this places a record in the wrapper output.

• Functions:

  o Constructor(structure)

    • structure: parameter that indicates the component input record to be used as the wrapper result.

  o add(record): this allows the component input record that is to be used as the wrapper result to be added subsequently.

    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: this allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.

• Functions:

  o Constructor(recordsObj, name)

    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.

    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.

    • fieldName: name of the field.

    • expression: field definition expression; e.g., "$0.PARAM1" indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:

      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

  o exec(): this runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
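
The way a field reference in an add expression points into the recordsObj list can be sketched as follows. The sketch assumes that the record index and the field name are separated by a period ("$0.PARAM1") and handles only that simple form; resolveFieldRef is a hypothetical helper, not an ITPilot function, and the functions of Appendix A of [GENER] are not covered.

```javascript
// Hypothetical resolver for references of the assumed form "$<index>.<FIELD>".
function resolveFieldRef(expression, recordsObj) {
  const match = /^\$(\d+)\.(\w+)$/.exec(expression);
  if (!match) {
    throw new Error("Unsupported expression: " + expression);
  }
  const record = recordsObj[Number(match[1])]; // $0 is the first input record
  const fieldName = match[2];
  if (record === undefined || !(fieldName in record)) {
    throw new Error("Cannot evaluate expression: " + expression);
  }
  return record[fieldName];
}

const recordsObj = [{ PARAM1: "books" }, { TOTAL: 42 }];

console.log(resolveFieldRef("$0.PARAM1", recordsObj)); // books
console.log(resolveFieldRef("$1.TOTAL", recordsObj));  // 42
```

A reference to a missing record or field raises an error here, which corresponds to the situation in which the errorAction value decides whether the wrapper run stops or continues.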

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: this creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.

    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used to date is reused, or whether a new browser is launched that maintains the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel.

    • inputPage: optional; this allows a starting page to be indicated.

  o exec(): this returns a page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.

• Functions:

  o Constructor(page)

    • page: page loaded in the browser that is going to be released.

  o Constructor(browserUuid)

    • browserUuid: browser identifier.

  o exec(): executes the component.

5.3.25 Repeat

• Description: Allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
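The do… while pattern of Figure 7 can be tried outside ITPilot with a stand-in for the Condition object. The following is a minimal sketch in plain JavaScript: StubCondition and repeatLoop are hypothetical names, not ITPilot classes, and the sketch assumes that exec() returns true while the loop should continue. It illustrates that the loop body always runs once before the condition is first evaluated.

```javascript
// Hypothetical stand-in for the ITPilot Condition object (not the real class).
// exec() returns true while the loop should keep running (an assumption).
function StubCondition(predicate) {
    this.predicate = predicate;
}
StubCondition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// Runs body at least once, then again for as long as condition.exec() is true.
function repeatLoop(body, condition) {
    var iterations = 0;
    do {
        body(iterations);
        iterations++;
    } while (condition.exec([]));
    return iterations;
}

var pagesVisited = [];
var morePages = new StubCondition(function () {
    return pagesVisited.length < 3;   // keep looping until three pages are collected
});
var runs = repeatLoop(function (i) {
    pagesVisited.push("page-" + (i + 1));
}, morePages);

// A condition that is false from the start still lets the body run once.
var never = new StubCondition(function () { return false; });
var singleRun = repeatLoop(function () {}, never);
```

Because the check happens after the body, operations that must not run on an empty input need their own guard inside the loop.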

5.3.26 Script

• Description: This component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it occupies within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: Creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)

    • sequence: NSEQL browsing program (see [NSEQL]).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.

  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.

    • inputValues: list of values that can be used as input parameters within the browsing sequence.

    • inputPage: optional parameter that describes the page from which the component's browsing sequence is run.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o close(): closes the connection with the running browser.

  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.

    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out there.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, that is, when neither an explicit back sequence nor a POST navigation has been configured as the back sequence.

    • pages: number of back pages.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): determines the browser implementation used by the component. The accepted values are:

    • 0: default browser implementation.

    • 1: Internet Explorer browser implementation.

    • 2: Firefox browser implementation.

    • 3: Denodo HTTP browser implementation.
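The way the syncWithPost, setBackSequence and setBackPages settings interact when ITPilot has to return to a source page can be read as a decision order. The sketch below is plain JavaScript written only from the descriptions above; it does not call any ITPilot API. The function chooseBackStrategy and its settings object are hypothetical names used for illustration, and the default of one back page is an assumption.

```javascript
// Hypothetical helper: reports which mechanism would be used to get back to
// the source page, following the order described for syncWithPost,
// setBackSequence and setBackPages.
function chooseBackStrategy(settings) {
    if (settings.syncWithPost) {
        // Re-issue the POST with the parameters the page arrived with.
        return { strategy: "POST" };
    }
    if (settings.backSequence) {
        // An explicit NSEQL back program was supplied.
        return { strategy: "BACK_SEQUENCE", sequence: settings.backSequence };
    }
    // Fall back to the NSEQL Back() command, repeated backPages times.
    var pages = settings.backPages || 1;   // default of 1 is an assumption
    return { strategy: "NSEQL_BACK", pages: pages };
}

var withPost = chooseBackStrategy({ syncWithPost: true, backSequence: "ignored" });
var withSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

In this reading the POST synchronization takes precedence, the explicit back sequence is consulted next, and the back-page count only matters in the last case.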

5.3.28 Store File

• Object: StoreFile

• Description: Stores the contents received as the input parameter in a file.

• Functions:

  o Constructor(content, file)

    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case the page content is stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.

    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.
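The setRetries and setRetryDelay functions appear on both Sequence and StoreFile. As a rough illustration of what such settings usually mean, the sketch below retries a failing action a fixed number of times with a pause between attempts. It is plain JavaScript, not ITPilot code: runWithRetries, the injected sleep function and the reading of count as "extra attempts after the first failure" are all assumptions.

```javascript
// Hypothetical helper: runs action(), retrying up to `retries` more times after
// a failure and calling sleep(delayMs) between attempts. sleep is injected so
// the sketch stays synchronous and easy to test.
function runWithRetries(action, retries, delayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (e) {
            if (attempt >= retries) {
                throw e;              // retries exhausted: propagate the last error
            }
            attempt++;
            sleep(delayMs);           // wait before the next attempt
        }
    }
}

var waits = [];
function recordSleep(ms) {
    waits.push(ms);                   // records the delay instead of really waiting
}

// Fails on the first two attempts and succeeds on the third.
var stored = runWithRetries(function (attempt) {
    if (attempt < 2) {
        throw new Error("temporary failure " + attempt);
    }
    return "stored after " + (attempt + 1) + " attempts";
}, 3, 500, recordSleep);

// With too few retries the last error reaches the caller.
var failure = null;
try {
    runWithRetries(function () { throw new Error("still failing"); }, 1, 500, recordSleep);
} catch (e) {
    failure = e.message;
}
```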

5.3.29 Thread

• Object: Thread

• Description: Represents a thread in the ITPilot wrapper. It is typically used when the records obtained in an extraction operation are to be processed concurrently.

• Functions:

  o wait(): blocks until all the executions launched with the execute function have finished.

  o execute(functionName, <list of arguments>): launches a thread that runs the given function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that will run in parallel. Later requests are queued until the ongoing executions finish.

    • int: maximum number.
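The queueing behaviour described for setMaxConcurrentThreads can be pictured with a small simulation. The sketch below is plain, single-threaded JavaScript, not the ITPilot Thread object: StubThread, its function registry and the batch-by-batch execution are hypothetical simplifications. They only show that requests beyond the limit wait their turn and that wait() returns once every queued call has run; a real thread pool would free slots one at a time rather than in batches.

```javascript
// Hypothetical, single-threaded stand-in for the Thread component.
// execute() only queues the call; wait() drains the queue in batches of at
// most maxConcurrent calls, which mimics "later requests are queued".
function StubThread(registry) {
    this.registry = registry;     // maps function names to functions
    this.maxConcurrent = 1;
    this.queue = [];
    this.batches = [];            // records which calls shared a batch
}
StubThread.prototype.setMaxConcurrentThreads = function (max) {
    this.maxConcurrent = max;
};
StubThread.prototype.execute = function (functionName) {
    var args = Array.prototype.slice.call(arguments, 1);
    this.queue.push({ name: functionName, args: args });
};
StubThread.prototype.wait = function () {
    while (this.queue.length > 0) {
        var batch = this.queue.splice(0, this.maxConcurrent);
        var labels = [];
        for (var i = 0; i < batch.length; i++) {
            var fn = this.registry[batch[i].name];
            labels.push(fn.apply(null, batch[i].args));
        }
        this.batches.push(labels);
    }
};

var processed = [];
var functions = {
    processRecord: function (title) {
        processed.push(title);
        return title;
    }
};

var thread = new StubThread(functions);
thread.setMaxConcurrentThreads(2);
var titles = ["A", "B", "C", "D", "E"];
for (var t = 0; t < titles.length; t++) {
    thread.execute("processRecord", titles[t]);
}
thread.wait();   // returns only after all five queued calls have run
```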

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where "mycustom" is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:

    • LIST_TYPE = 1

    • PAGE_TYPE = 2

    • RECORD_TYPE = 3

    • SIMPLE_TYPE = 4

    • ARRAY_TYPE = 5

    • BINARY_TYPE = 6

    • BOOLEAN_TYPE = 7

    • DATE_TYPE = 8

    • DOUBLE_TYPE = 9

    • FLOAT_TYPE = 10

    • INT_TYPE = 11

    • LONG_TYPE = 12

    • STRING_TYPE = 13

    • URL_TYPE = 14

    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
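Put together, a custom component file might look like the following sketch. The component name "uppercase", its logic, the assumption that the input is a plain string, and the local STRING_TYPE declaration are all hypothetical. The value 13 is taken from the list above, and in a real wrapper the type constants are presumably supplied by ITPilot. The body of the input-structure function is left as a placeholder, because the calls for describing a schema are not covered in this section.

```javascript
// Hypothetical custom component named "uppercase", following the
// <name>_main / <name>_getInputStructure / <name>_getOutputType convention.

// Declared locally so the sketch runs on its own; 13 is the value listed for
// STRING_TYPE above.
var STRING_TYPE = 13;

// Main function: receives the component input and returns its output.
// The input is assumed to be a plain string.
function uppercase_main(uppercase_input) {
    var uppercase_output = null;
    if (uppercase_input !== null && uppercase_input !== undefined) {
        uppercase_output = String(uppercase_input).toUpperCase();
    }
    return uppercase_output;
}

// Input schema definition: placeholder only, see the note above.
function uppercase_getInputStructure() {
    return null;
}

// Output type: a simple string, so no uppercase_getOutputStructure is needed
// (that function is only required for RECORD_TYPE or LIST_TYPE outputs).
function uppercase_getOutputType() {
    return STRING_TYPE;
}

var sample = uppercase_main("the big sleep");
```

Keeping the three function names aligned on the same prefix matters, since the prefix is what identifies the component.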

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by one .js file, whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as described in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> is the name of the component.

• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing a VQL statement of the following form:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code developed in the previous steps.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

Boolean or blob. The value of a compound attribute is always an array of registers. In the same way, each register is composed of several fields, and these fields may in turn be either atomic or compound.

For example, a wrapper that returns data on movies may have a schema in which each result comprises the fields TITLE, DIRECTOR and EDITIONS. TITLE and DIRECTOR are atomic fields, and EDITIONS is a compound field containing data on the various editions available of the movie (DVD, VHS, director's cut, etc.). The value of EDITIONS is an array of registers, where each register contains the fields FORMAT, PRICE and DESCRIPTION, all of which are atomic.

The invocation of getSchema() returns an instance of the class HTMLWrapperMetaRegisterRawVO, which represents the schema of a "hierarchical" register of the type described above. See the Javadoc documentation for a detailed description of the methods provided by HTMLWrapperMetaRegisterRawVO.

It is also possible to access the characteristics of the atomic fields that make up the schema. Information about these atomic fields is represented as instances of the class HTMLWrapperMetaSimpleRawVO. Specifically, the following information can be obtained from an atomic field: its type (method java.lang.Class getType()); whether its value is obtained from the source or not, that is, whether it is a searchable field that cannot be found in the output schema (method boolean isSearchStatus()); and, in that case, whether it is mandatory or not (method boolean isMandatoryStatus()). Furthermore, if they were defined during the generation process, it is also possible to obtain the regular expression (method java.lang.String getRegexp()) and the aliases defined for each field (method java.util.List getTextValues()).

Finally, the methods

    void setMaintenance(boolean value)
    void setMaintenance(boolean maintenance, boolean regenerate, boolean autodeploy)

allow setting through the API whether or not a wrapper should be maintained automatically by the ITPilot automatic maintenance server. The regenerate parameter indicates whether ITPilot should try to generate a new wrapper automatically when a change in the source is detected. The autodeploy parameter indicates whether the regenerated wrapper should be installed automatically in the ITPilot server, replacing the old one. If this last parameter is set to false, the new wrapper is stored in the path DENODO_HOME/metadata/maintenance-regenerations. The replaced versions of the wrapper are stored in the path DENODO_HOME/metadata/maintenance-backup (the replacement date is added to the name of the wrapper to generate the file name). If the first method is used (without the regenerate and autodeploy parameters), the wrapper will be regenerated and auto-deployed in the ITPilot server. See [USER] for more information about the automatic maintenance process in ITPilot.

3.4 PROCESSING QUERY RESULTS

The query method used to execute queries against a wrapper returns an instance of the class com.denodo.itpilot.client.HTMLWrapperResultIterator. This class (which implements the interface java.util.Iterator) provides asynchronous access to the results of the query. Asynchronous access means that the server returns the results of the query as they are obtained from the source (it is important to remember that the wrapper obtains the data from the source in real time through the network).

The method hasNext() checks whether there are still elements to return. Because of this asynchronous behavior, hasNext() must be called before accessing each element, to make sure that data elements are available.

The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname), where fieldname is the name of the desired field. The method next() throws an exception of type NoSuchElementException if no data are available at that moment, even if the wrapper still has results to return. Hence the need to use the method hasNext().

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX() (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation [JDOC] for details.

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:

• Boolean checkErrors(): checks whether an error has occurred during query execution. Returns 'true' if an error has occurred and 'false' if not.

• String getErrorDescription(): if an error has occurred, returns a textual description of it; otherwise it returns null. The custom error messages specified by the wrapper creator for the raise error handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.

3.4.1 Canceling Queries

The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

    void cancel()

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine 'acme', on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the one used as an example in section 3.3:

    TITLE, DIRECTOR, EDITIONS: {FORMAT, PRICE, DESCRIPTION}

where TITLE and DIRECTOR are optional search fields.

Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and written to the standard output. To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text, except the field PRICE, which is a double. Finally, any errors produced during execution are checked.

    package com.denodo.itpilot.client;

    import java.util.List;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Iterator;

    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

    public class ITPilotExample {

        public static void main(String args[]) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

                // Get wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results; hasNext() must be checked before each next()
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ": ");

                    // Get and print atomic fields TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);

                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);

                        RegisterVO editionVO = (RegisterVO) editions.next();
                        Map edition = editionVO.getValues();

                        SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);

                        // PRICE is the only field that is not of type text
                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);

                        SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\t DESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server");
            } finally {
            }
        }
    }

Figure 1: Example of query execution to a wrapper

4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes packaged in a Jar file that is added to ITPilot, so they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. We recommend developing custom functions using Java annotations, although they can also be defined through naming conventions, which makes it possible to create them without dependencies on Denodo libraries. The annotations and the compound types and values required to create custom functions are located in

    $DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.

• All custom functions stored in the same jar are added or removed together, by uploading or removing the jar in the server.

• Each function can have many signatures. Each signature is defined by an execution method in the Java class that defines the custom function.

• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class that contains its whole implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, different types, or both). For each signature of the function, this class must define a Java method that implements the function for those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to ITPilot data types. The following table shows how the mapping works and which Java types can be used:

    Java                  ITPilot
    java.lang.Integer     int
    java.lang.Long        long
    java.lang.Float       float
    java.lang.Double      double
    java.lang.Boolean     boolean
    java.lang.String      text
    java.util.Calendar    date
    byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be primitive types (int, long, double, etc.).

4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction is interpreted as a function named Concat_Sample.

All Java methods that implement the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String generates the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array; for example, the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or a register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions cover only a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: class annotation used to define the class as a custom function. The annotation requires the parameters:

  • name: name of the custom function.

  • type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor: method annotation used to mark the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional attribute, syntax, that specifies how the function signature is presented to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType: method annotation used to mark the method as the one that computes the return type of a function signature before a query is executed.

• com.denodo.common.custom.annotations.CustomParam: parameter annotation with the parameter name, used to make the automatically generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.

4.2 COMPOUND TYPES

Compound types and values in custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on the input values, or which return compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it gives better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).

2. The execution method must have the same number of parameters as the additional method.

3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.

If the execute method returns a basic Java type, the additional method has to return the same basic Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 for the type that these return values will have in ITPilot.

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        @CustomExecutor()
        public CustomArrayValue split_sample(@CustomParam(name="regexp") String regex,
                @CustomParam(name="valuer") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2: ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.

Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: it is the only mandatory one and contains the component implementation.

2. getInit() function: this must be used to return the set of searchable parameters.

3. getOutputSchema() function: this function is used to return the structure of the output objects, if they exist.¹

These functions map onto the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

¹ Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

The getInit() function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), is the one responsible for creating a new parameter initialization object. This object is described further on in section 5.3 (the Component Catalog).

5.2.2 Main Function

This is the place where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The published functions of every predefined ITPilot component are described in section 5.3.

5.2.3 Generating the Output Structure

The getOutputSchema() function determines the wrapper’s output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is described in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if an argument to its right has to be given a value. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). rs.setText("FIELD", OBLIGATORY) is not valid; the following must be used: rs.setText("FIELD", "", OBLIGATORY).
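The positional rule in the note can be illustrated with a minimal sketch. RecordStructureSketch is a hypothetical stand-in written only so the example runs outside ITPilot, and the OPTIONAL and OBLIGATORY values are placeholders; only the setText(field, regexp, type) signature and its defaults are taken from this guide.

```javascript
// Placeholder constants; inside a wrapper ITPilot supplies the real ones.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Hypothetical stand-in for Record_Structure (not the real object).
function RecordStructureSketch(name) {
    this.name = name;
    this.fields = [];
}

// setText(field, regexp, type): regexp and type are optional, but arguments
// are positional, so type can only be reached if regexp is given as well.
RecordStructureSketch.prototype.setText = function (field, regexp, type) {
    this.fields.push({
        field: field,
        regexp: regexp === undefined ? "" : regexp,
        type: type === undefined ? OPTIONAL : type
    });
};

var rs = new RecordStructureSketch("OUT_REC");
rs.setText("A");                  // defaults: regexp "" and type OPTIONAL
rs.setText("B", "", OPTIONAL);    // same result, written out in full
rs.setText("C", "", OBLIGATORY);  // regexp supplied so that type can be set
rs.setText("D", OBLIGATORY);      // mistake: OBLIGATORY is taken as the regexp
```

In this stand-in the last call silently stores OBLIGATORY as the regular expression and leaves the field optional, which is why the short form has to be avoided. How ITPilot itself reacts to such a call is not shown here.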

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections define them.

5.3.2.1 Record Structure

• Object: Record_Structure

• Description: This represents a data structure that allows the definition of the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).

• Functions:

o Constructor(name)

• name: name of the structure.

o setText(field, regexp, type): creation of a new character string field in the record.

• field: name of the new field.

• regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is “”.

• type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

o setLink(field, type): creation of a new Link-type field in the record.

• field: name of the new field.

• type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

o setInt(field, type): creation of a new Integer-type field in the record.

• field: name of the new field.

• type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

o setBoolean(field, type): creation of a new boolean-type field in the record.

• field: name of the new field.

• type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

o setLong(field, type): creation of a new Long-type field in the record.

• field: name of the new field.

• type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

o setFloat(field, type): creation of a new Float-type field in the record.

• field: name of the new field.

• type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

o setDouble(field, type): creation of a new Double-type field in the record.

• field: name of the new field.

• type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

o setBlob(field, type): creation of a new BLOB-type (Binary Large Object) field in the record.

• field: name of the new field.

• type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

o setDate(field, regexp, format, type): creation of a new Date-type field in the record.

• field: name of the new field.

• regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is “”.

• format (optional): date format following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.

• type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

o setRegister(record, type): creation of a new Record-type field in the record.

• record: record name.

• type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

o setArray(name, structure, type): creation of a new Array-type field in the record.

• name: name of the array.

• structure: data structure that represents the record structure contained in the array.

• type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

o toString(): transforms the record into a character string for its representation.

When a custom component is created from an ITPilot wrapper program (see section 5.4), a Record Structure is defined to represent the input values of the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List

• Functions:

o setListName(listName): sets the name of the list.

• listName: name of the list.

o add(obj): addition of an element to the list.

• obj: element to add.

o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these “common” functions.

5.3.3.1 onError function

• onError(errorId, errorAction): This tells the component how to behave in the event of an error. The onError function can be invoked several times with different errorId parameter values.

o errorId: This indicates the type of error for which the behavior is to be managed. The possible values are:

• RUNTIME_ERROR: error while the component is being run.

• CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.

• HTTP_ERROR: error produced by an HTTP error.

• TIMEOUT_ERROR: This error is raised if the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.

• SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).

o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:

• ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

• ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value will return “null” when there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, it will be evaluated as “false” if they are used in a condition expression.

• ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configured in each parameter.

• ON_ERROR_RETRY_IGNORE: rerun the wrapper, as with the ON_ERROR_RETRY action, but continue with the wrapper execution if the error still occurs after the retries.
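The four error actions can be compared side by side with a small sketch. runWithErrorAction is a hypothetical helper, not an ITPilot function, and the constant values are placeholders. It assumes that ON_ERROR_RETRY raises the last error once the retries are exhausted (inferred from the contrast with ON_ERROR_RETRY_IGNORE), and it omits the delay between retries.

```javascript
// Placeholder constants mirroring the errorAction names of this section.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Hypothetical helper: runs step() under the given error action.
function runWithErrorAction(step, action, retries) {
    var retrying = action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE;
    var attempts = retrying ? retries + 1 : 1;
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return step();
        } catch (e) {
            lastError = e; // remember the failure and try again if allowed
        }
    }
    if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
        return null; // the run continues with a null result
    }
    throw lastError; // ON_ERROR_RAISE, or ON_ERROR_RETRY with no retries left
}

// A step that fails twice and then succeeds.
var calls = 0;
function flakyStep() {
    calls++;
    if (calls < 3) {
        throw new Error("TIMEOUT_ERROR");
    }
    return "page";
}

var retried = runWithErrorAction(flakyStep, ON_ERROR_RETRY, 3);
var ignored = runWithErrorAction(function () {
    throw new Error("HTTP_ERROR");
}, ON_ERROR_IGNORE, 0);
```

Here retried holds the value returned by the third attempt, while ignored is null because the failing step was skipped rather than raised.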

5.3.3.2 debugLevel function

• debugLevel(level): Sets the trace level used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:

o TRACE

o DEBUG

o INFO

o WARN

o ERROR

o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List

• Description: adds a record to a list.

• Functions:

o Constructor()

o exec(record, list): executes the function.

• record: record to be added to the list.

• list: list to which the record is added.

5.3.5 Condition

• Object: Condition

• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.

• Functions:

o Constructor(expr)

• expr: this parameter defines the condition expression. It is expressed as a string of characters (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

o exec(elements): main function of the Condition component. This evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.

• elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
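The way the positional placeholders $0, $1, ... refer to the elements passed to exec can be illustrated with a minimal sketch. ConditionSketch is a hypothetical stand-in, not the ITPilot Condition object: it simply rewrites each $n into a lookup in the elements array and evaluates the result as plain JavaScript, so it only handles JavaScript-compatible comparisons and none of the functions from Appendix A of [GENER].

```javascript
// Hypothetical stand-in for the Condition component (not the real object).
function ConditionSketch(expr) {
    this.expr = expr;
}

ConditionSketch.prototype.exec = function (elements) {
    // Rewrite each positional placeholder $n into elements[n].
    var body = this.expr.replace(/\$(\d+)/g, "elements[$1]");
    // Evaluate the rewritten expression against the given elements.
    var evaluate = new Function("elements", "return (" + body + ");");
    return Boolean(evaluate(elements));
};

var myCondition = new ConditionSketch("($0 <= $1)");
var firstIsSmaller = myCondition.exec([3, 5]);
var firstIsLarger = myCondition.exec([7, 5]);
```

With these inputs firstIsSmaller is true and firstIsLarger is false, matching the reading of the expression given above: the first element must be less than or equal to the second.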

5.3.6 Create List

• Object: Create_List

• Description: creates an empty list.

• Functions:

o Constructor(listname): creates an empty list.

• listname: name of the list of records to be created.

o exec(): runs the component.
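The usual call order, creating a list first and then adding records to it, can be sketched as follows. All three constructors are hypothetical stand-ins written so the sequence runs outside ITPilot; they only mirror the signatures listed in sections 5.3.2.2, 5.3.4 and 5.3.6. That exec() returns the newly created list is an assumption of this sketch.

```javascript
// Hypothetical stand-in for the List data structure (section 5.3.2.2).
function ListSketch() {
    this.name = null;
    this.items = [];
}
ListSketch.prototype.setListName = function (listName) { this.name = listName; };
ListSketch.prototype.add = function (obj) { this.items.push(obj); };
ListSketch.prototype.toArray = function () { return this.items.slice(); };

// Hypothetical stand-in for Create_List.
function CreateListSketch(listname) {
    this.listname = listname;
}
// Assumption: exec() hands back the new, empty list.
CreateListSketch.prototype.exec = function () {
    var list = new ListSketch();
    list.setListName(this.listname);
    return list;
};

// Hypothetical stand-in for Add_Object_To_List.
function AddObjectToListSketch() {}
AddObjectToListSketch.prototype.exec = function (record, list) {
    list.add(record);
};

var results = new CreateListSketch("RESULTS").exec();
var adder = new AddObjectToListSketch();
adder.exec({ TITLE: "first" }, results);
adder.exec({ TITLE: "second" }, results);
var asArray = results.toArray(); // plain JavaScript array of the two records
```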

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser

• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.

• Functions:

o Constructor(): creates a persistent browser and returns its handler.

o exec(): executes the component.

5.3.8 Diff

• Object: Diff

• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.

• Functions:

o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)

• additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).

• additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).

• deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

• deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).

• tokenSeparator: indicates the character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.

o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.

• baseCode: character string with the source page content.

• finalCode: character string or page object with the target page content.

o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.

• baseCode: character string with the source page content.

• finalCode: character string or page object with the target page content.

o setAdditionPrefixLabel(additionPrefixLabel): modifies the added data starting tag.

• additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).

o setAdditionSuffixLabel(additionSuffixLabel): modifies the added data ending tag.

• additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).

o setDeletionPrefixLabel(deletionPrefixLabel): modifies the deleted data starting tag.

• deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

o setDeletionSuffixLabel(deletionSuffixLabel): modifies the deleted data ending tag.

• deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).

o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return “null” instead of the page itself.

• nullWhenEquals: “true” implies that “null” will be returned when both pages are equal; “false” means that the result page will be returned.

o setIgnoreTagAttributes(simplifyTags): the component will not take the HTML tag attributes into account when comparing both pages.

• simplifyTags: “true” means that the HTML tag attributes will be ignored. With “false” they will not be ignored.

o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.

• toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.

o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.

• mergedDeletions: with “true”, the deleted content will be shown. If the value is “false”, the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.

o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.

• replacement: Perl [PERL] regular expression.

o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. These regular expressions are applied to the HTML tokens of the page; those that match a regular expression will be discarded before starting the comparison.

• regexp: Perl [PERL] regular expression.
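The role of the prefix and suffix labels and of the token separator can be illustrated with a small token-level comparison. This is only a sketch under stated assumptions: diffTokens is a hypothetical helper based on a longest-common-subsequence table, which is not necessarily the algorithm ITPilot uses, and the bracket labels stand in for the default HTML tags, whose exact markup is not given in this guide.

```javascript
// Hypothetical helper: marks deleted and added tokens with the given labels.
function diffTokens(baseTokens, finalTokens, labels) {
    var n = baseTokens.length;
    var m = finalTokens.length;
    var i, j;
    // lcs[i][j] = length of the longest common subsequence of
    // baseTokens[i..] and finalTokens[j..].
    var lcs = [];
    for (i = 0; i <= n; i++) {
        lcs.push([]);
        for (j = 0; j <= m; j++) {
            lcs[i].push(0);
        }
    }
    for (i = n - 1; i >= 0; i--) {
        for (j = m - 1; j >= 0; j--) {
            lcs[i][j] = baseTokens[i] === finalTokens[j]
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }
    // Walk both token lists, emitting common tokens unchanged.
    var out = [];
    i = 0;
    j = 0;
    while (i < n && j < m) {
        if (baseTokens[i] === finalTokens[j]) {
            out.push(baseTokens[i]);
            i++;
            j++;
        } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
            out.push(labels.deletionPrefix + baseTokens[i] + labels.deletionSuffix);
            i++;
        } else {
            out.push(labels.additionPrefix + finalTokens[j] + labels.additionSuffix);
            j++;
        }
    }
    for (; i < n; i++) {
        out.push(labels.deletionPrefix + baseTokens[i] + labels.deletionSuffix);
    }
    for (; j < m; j++) {
        out.push(labels.additionPrefix + finalTokens[j] + labels.additionSuffix);
    }
    return out;
}

// Placeholder labels and separator chosen for readability.
var labels = {
    additionPrefix: "[+", additionSuffix: "+]",
    deletionPrefix: "[-", deletionSuffix: "-]"
};
var tokenSeparator = " ";
var merged = diffTokens(["<p>", "old", "</p>"], ["<p>", "new", "</p>"], labels)
    .join(tokenSeparator);
```

For this input, merged is the string "<p> [-old-] [+new+] </p>": unchanged tokens pass through, the removed token carries the deletion labels and the new token carries the addition labels.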

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that will be evaluated to an output value.

• Functions:

o Constructor(expression)

• expression: object that defines the expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.

• exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor

• Description: this is responsible for extracting structured data from an HTML page by means of a DEXTL program ([DEXTL]).

• Functions:

o Constructor(name, page, specification, structure)

• name: name of the Extractor component instance.

• page: page-type ITPilot structure from which data is to be extracted.

• specification: DEXTL data extraction specification (see [DEXTL]).

• structure: name of the record (previously created) that will be used to return the data extracted by the specification.

o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined by the structure parameter of the constructor.

o setMergePatterns(merge): This applies the pattern merging technique for greater system optimization (see [GENER] for further information).

• merge: Boolean parameter. “true” if the pattern merge technique is to be applied, “false” if not. This is “true” by default.

o setI18n(i18n): Function that updates the internationalization of the process.

• i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: this obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

o Constructor(url, sequenceType, reusableConnection, binary, page)

• url: URL where the resource to be downloaded can be found (OPTIONAL).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: This indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

• binary: “true” means that the object to be downloaded is binary; “false” means that it is in text format.

• page: Optionally, the page from which the HTTP request is launched can be indicated.

o exec(page): This runs the component, returning the string- or binary-type value obtained.

• page: Optionally, the page from which the HTTP request is launched can be indicated.

o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.

• encoding: MIME type of the information to send.

o syncWithPost(flag): this function lets the user set the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.

• flag: “true” means that this synchronization function must be used. If it is ‘false’, ITPilot checks whether a back sequence, defined by the setBackSequence function, exists; if it does not exist, ITPilot executes a Back() NSEQL command.

o setBackSequence(back): this function lets the user optionally set an explicit navigation sequence that returns to the originating page, against which further information extraction operations are going to be executed.

• back: back sequence NSEQL program.

o setReusingConnection(reusingConnection): this function indicates whether connections will be reused or not.

• reusingConnection: if the value is set to “true”, the connection coming from previous components is reused; if set to ‘false’, a new browser will be launched, importing information from the previous session.

o setBackPages(pages): this function determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, which happens when no back sequence has been defined and the navigation has not been defined as a POST navigation.

o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation.

• 1: Internet Explorer browser implementation.

• 2: Firefox browser implementation.

• 3: Denodo HTTP browser implementation.
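The order in which the synchronization options of syncWithPost, setBackSequence and setBackPages are considered can be summarized in a short sketch. chooseSyncAction and the shape of its settings object are hypothetical names introduced for illustration; only the order of precedence is taken from the descriptions above.

```javascript
// Hypothetical helper: decides how the page state would be recovered.
function chooseSyncAction(settings) {
    if (settings.syncWithPost) {
        // Default method: re-send the POST parameters to the page URL.
        return { action: "POST" };
    }
    if (settings.backSequence) {
        // An explicit back sequence takes precedence over Back().
        return { action: "BACK_SEQUENCE", sequence: settings.backSequence };
    }
    // Otherwise a Back() NSEQL command is run for the configured page count.
    return { action: "BACK_COMMAND", pages: settings.backPages };
}

var withPost = chooseSyncAction({
    syncWithPost: true, backSequence: "<NSEQL back sequence>", backPages: 2
});
var withSequence = chooseSyncAction({
    syncWithPost: false, backSequence: "<NSEQL back sequence>", backPages: 2
});
var withBack = chooseSyncAction({
    syncWithPost: false, backSequence: null, backPages: 2
});
```

The three calls select POST, the explicit back sequence and the Back() command respectively, which shows that a back sequence is only consulted once POST synchronization has been disabled.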

5.3.13 Filter

• Object: Filter

• Description: this carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

o Constructor(expr, auxiliaryRecords)

• expr: expression of the filtering operation for the list of records, which is passed in the exec function.

• auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.

o exec(inputRecords, auxiliaryRecords): function receiving a list of records and returning the subset that complies with the selection expression indicated in the constructor.

• inputRecords: list of input records.

• auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
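The behavior described in the note can be sketched in plain JavaScript. filterIgnoringErrors is a hypothetical helper, not the ITPilot Filter object, and the predicate is an ordinary function rather than an ITPilot expression; the sketch only shows how a record whose evaluation fails is left out while the rest of the list is still filtered.

```javascript
// Hypothetical helper mimicking a filter whose error action is ON_ERROR_IGNORE.
function filterIgnoringErrors(inputRecords, predicate) {
    var kept = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) {
                kept.push(inputRecords[i]);
            }
        } catch (e) {
            // Only the record that caused the error is dropped.
        }
    }
    return kept;
}

var records = [
    { TITLE: "a", PRICE: { amount: 10 } },
    { TITLE: "b", PRICE: null },          // evaluating this record throws
    { TITLE: "c", PRICE: { amount: 3 } },
    { TITLE: "d", PRICE: { amount: 50 } }
];

var expensive = filterIgnoringErrors(records, function (record) {
    return record.PRICE.amount >= 10;
});
```

Here expensive contains records "a" and "d": record "c" does not meet the condition, and record "b" is skipped because reading its price fails.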

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: this allows a run loop to be generated for a specific form, where predetermined values for each of the included fields are used in each run.

• Functions:

o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

• findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

• submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

• baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.

• inputPage: input page from which the selected form can be iteratively invoked.

• parallelIterator: with “true”, the component will execute its iterations in parallel.

o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting with position 0.

• positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

• clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. There are certain JavaScript constants defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): this indicates the values selected from a multiple selection field for the chosen form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting with position 0.

• valuesArray: list of values that must be selected in the field.

• positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

• equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).

• clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. There are certain JavaScript constants defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectPositions(field, position, positions): this indicates the values selected from a selection field for the chosen form.

• field: name of the HTML selection field.

• position: position occupied by the field in the event of more than one field element with the same name.

• positions: values of the elements on which the component must iterate.

o selectTexts(field, position, values, positions, equals): this indicates the values to be used in the different iterations on a text field.

• field: name of the HTML text field.

• position: position of the field in the event of several fields on the form with the same name.

• values: list of values that must be selected in the field.

• positions: list that indicates the position held for each value element in the event of replicated values.

• equals: boolean value that indicates whether the field values must exactly match those provided to the function (true) or may merely contain them (false).

o click(field, value, state): function that selects an element and runs a “click” event on it.

• field: name of the HTML field on which the click is to be made.

• value: when this function is run on Radio Buttons, this parameter indicates the elements selected, as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.

• state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o input(field, position, values): function that indicates the values added to an input field.

• field: name of the HTML input field.

• position: position of the field in the event of several fields on the form with the same name.

• values: list of values that must be entered in the field.

o textarea(field, position, values): this indicates the values added to a text area.

• field: name of the HTML text area field.

• position: position of the field in the event of several fields on the form with the same name.

• values: list of values that must be entered in the field.

o toList(): returns the list of NSEQL sequences used in each iteration.

o setMaxIterations(count): sets the maximum number of iterations that can be executed.

• count: number that determines the maximum number of iterations.

o setRetries(count): sets the number of retries in the event of failures.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.

o setParallelIterator(flag): makes the component launch its iterations in parallel.

• flag: with “true”, the iterations will be executed in parallel.

o next(inputPage): this returns the page resulting from running a component iteration.

• inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.

  o hasNext(): determines whether there are more results. It returns “true” if there is at least one more result, or “false” if there is not.

  o close(): closes the iterator.

  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization method.

    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation.

    • 1: Internet Explorer browser implementation.

    • 2: Firefox browser implementation.

    • 3: Denodo HTTP browser implementation.
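
The iteration functions listed above (setMaxIterations(), hasNext(), next(), close()) follow a conventional iterator calling pattern. The following is a minimal sketch of that pattern only. FakeFormIterator and collectPages are hypothetical names invented for this example, the "pages" are plain strings rather than ITPilot Page objects, and nothing here reflects how ITPilot implements the component.

```javascript
// Hypothetical stand-in: NOT the ITPilot component. It only mimics the
// setMaxIterations()/hasNext()/next()/close() calling protocol.
function FakeFormIterator(pages) {
    this.pages = pages;                  // one fake "page" per form combination
    this.index = 0;
    this.maxIterations = pages.length;
    this.closed = false;
}
FakeFormIterator.prototype.setMaxIterations = function (count) {
    this.maxIterations = count;
};
FakeFormIterator.prototype.hasNext = function () {
    return !this.closed &&
        this.index < this.pages.length &&
        this.index < this.maxIterations;
};
FakeFormIterator.prototype.next = function () {
    return this.pages[this.index++];
};
FakeFormIterator.prototype.close = function () {
    this.closed = true;
};

// Drains the iterator and always closes it, even if processing fails.
function collectPages(iterator) {
    var visited = [];
    try {
        while (iterator.hasNext()) {
            visited.push(iterator.next());
        }
    } finally {
        iterator.close();
    }
    return visited;
}

var formIterator = new FakeFormIterator(["page-a", "page-b", "page-c"]);
formIterator.setMaxIterations(2);        // stop after two iterations
var visitedPages = collectPages(formIterator);
```

Closing the iterator in a finally block keeps the release step independent of whether an iteration raised an error.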

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.

• Functions:

  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.

    • browserUuid: browser id.

  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters, for later browsing by using a Sequence object (see section 5.3.27).

    • pageType: type of browser used to access the page:

      • SEQUENCE_IEBROWSER = 1

      • SEQUENCE_HTTP_BROWSER = 2

    • lastURL: last URL where the page is coming from.

    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.

    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.

    • cookie: stored “cookies” information.

    • proxyUser: user name to access the proxy, if required.

    • proxyPassword: user password to access the proxy, if required.

    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: stores the structure of the input data, that is, the data that the wrapper will receive from the calling application.

• Functions:

  o Constructor(input, output)

    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process: if it is not specified, the record is generated at runtime (with the exec() function).

  o get(name): returns the value of a field of the record that groups the initialization parameters.

    • name: name of the record field.

  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

    • field: name of the field to create.

    • format: representation format of the date field, as defined in [DATEFORMAT]. The format is optional but, once specified, it becomes compulsory; otherwise the wrapper may not run.

    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): updates the component name.

    • name: new component name.

  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.

    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
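
The interplay between OPTIONAL, OBLIGATORY and FIXED can be illustrated with a small stand-in. This is a sketch under stated assumptions, not ITPilot code: FakeInit is a hypothetical object written for this example, the three constants are given string values here for readability (the real values are not documented in this section), and raising an error for a missing obligatory parameter is an assumed behaviour.

```javascript
// Assumed constant values, for illustration only.
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

// Hypothetical stand-in for the Init component.
function FakeInit(queryValues) {
    this.queryValues = queryValues;   // values supplied by the calling application
    this.fields = [];
}
FakeInit.prototype.setText = function (field, obl, fixedValue) {
    this.fields.push({ name: field, obl: obl || OPTIONAL, fixedValue: fixedValue });
};
FakeInit.prototype.exec = function () {
    var record = {};
    for (var i = 0; i < this.fields.length; i++) {
        var f = this.fields[i];
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue;              // constant value wins
        } else if (this.queryValues.hasOwnProperty(f.name)) {
            record[f.name] = this.queryValues[f.name];  // value from the query
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null;                      // optional and absent
        }
    }
    return record;
};

var init = new FakeInit({ TITLE: "Dune" });
init.setText("TITLE", OBLIGATORY);
init.setText("AUTHOR");                  // obl defaults to OPTIONAL
init.setText("LANG", FIXED, "en");
var initRecord = init.exec();
```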

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records, one by one.

• Functions:

  o Constructor(list)

    • list: list of records over which to iterate.

  o hasNext(): determines whether there are more results over which to iterate. It returns “true” if there is at least one more result.

  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
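
To make the structure of Figure 5 concrete, the sketch below runs the same loop against two stand-ins. FakeIterator and FakeThread are hypothetical objects written for this example. FakeThread runs each call immediately and takes a function reference, whereas the real Thread object is described as running the function concurrently; processRecord and the TITLE field are likewise invented.

```javascript
// Hypothetical stand-ins, for illustration only.
function FakeIterator(list) { this.list = list; this.pos = 0; }
FakeIterator.prototype.hasNext = function () { return this.pos < this.list.length; };
FakeIterator.prototype.next = function () { return this.list[this.pos++]; };

// Runs every call synchronously and counts the calls.
function FakeThread() { this.calls = 0; }
FakeThread.prototype.execute = function (fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    this.calls++;
    fn.apply(null, args);
};

// Per-record processing function (hypothetical).
function processRecord(outputList, record) {
    outputList.push(record.TITLE);
}

var titles = [];
var iterator = new FakeIterator([{ TITLE: "A" }, { TITLE: "B" }]);
var thread = new FakeThread();
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    thread.execute(processRecord, titles, recordInstance);
}
```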

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.

• Functions:

  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

    • uuid: unique identifier of the component.

    • uri: connection URL to the database.

    • driver: driver class used to connect to the data source.

    • userName: user name.

    • password: user password.

    • structure: structure of the component’s output record list. It is defined as a record of values.

    • baseRecords: record list to be used.

    • maxPoolSize: maximum number of connections that the pool can manage at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

    • query: SQL query that returns the results required by the component.

  o exec(query, baseRecords): executes the JDBCExtractor component.

    • query: SQL query that returns the results required by the component.

    • baseRecords: record list to be used.

  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

    • maxPoolSize: maximum number of connections that the pool can manage at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.

  o addDriverProperty(propname, propvalue): adds a JDBC driver property.

    • propname: property name.

    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE... DO). The Loop component is implemented in JavaScript as a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }

Figure 6 Using the Loop function
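
The control flow of Figure 6 can be tried outside ITPilot with a stand-in condition. In this sketch, FakeCondition is a hypothetical object that wraps a plain JavaScript function instead of an ITPilot condition expression; only the fact that exec() returns a boolean is mimicked, and pagesFetched and maxPages are invented names.

```javascript
// Hypothetical stand-in for the Condition object: exec() returns a boolean.
function FakeCondition(predicate) { this.predicate = predicate; }
FakeCondition.prototype.exec = function (inputs) { return this.predicate(inputs); };

var pagesFetched = 0;
var maxPages = 3;
var loop = new FakeCondition(function () { return pagesFetched < maxPages; });

// WHILE... DO: the condition is checked before every pass.
while (loop.exec([])) {
    pagesFetched++;        // the loop operations would go here
}
```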

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: allows iterating over several interrelated pages, using one or several browsing sequences.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.

    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched maintaining the session’s information.

    • inputPage: indicates the page from which the next browsing sequence is to be run.

  o next(inputRecords, inputPage): returns the next iteration element.

    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.

    • inputPage: indicates the page from which the next pages are to be accessed.

  o close(): closes the iterator.

  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.

    • count: number of retries.

  o setRetryDelay(count): configures the interval between two retries.

    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation.

    • 1: Internet Explorer browser implementation.

    • 2: Firefox browser implementation.

    • 3: HTTP browser implementation.

5.3.21 Output

• Object: Output

• Description: places a record in the wrapper output.

• Functions:

  o Constructor(structure)

    • structure: component input record to be used as the wrapper result.

  o add(record): adds, after construction, the component input record to be used as the wrapper result.

    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.

• Functions:

  o Constructor(recordsObj, name)

    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.

    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): adds a new field to the record under construction.

    • fieldName: name of the field.

    • expression: field definition expression; e.g. “$0PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:

      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements, except for the one that caused the error.
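
The difference between the two error actions can be sketched with a stand-in. Everything in the following block is an assumption made for illustration: FakeRecordConstructor is a hypothetical object, expressions are plain JavaScript functions rather than ITPilot expressions (see [GENER] for the real expression functions), and the two constants are given string values only so that the example runs.

```javascript
// Assumed constant values, for illustration only.
var ON_ERROR_RAISE = "ON_ERROR_RAISE", ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical stand-in for the Record Constructor component.
function FakeRecordConstructor(recordsObj, name) {
    this.recordsObj = recordsObj;
    this.name = name;
    this.fieldDefs = [];
}
FakeRecordConstructor.prototype.add = function (fieldName, expression, errorAction) {
    this.fieldDefs.push({ fieldName: fieldName, expression: expression, errorAction: errorAction });
};
FakeRecordConstructor.prototype.exec = function () {
    var record = {};
    for (var i = 0; i < this.fieldDefs.length; i++) {
        var def = this.fieldDefs[i];
        try {
            record[def.fieldName] = def.expression(this.recordsObj);
        } catch (e) {
            if (def.errorAction === ON_ERROR_RAISE) { throw e; }
            // ON_ERROR_IGNORE: leave the failing field out and continue
        }
    }
    return record;
};

var rc = new FakeRecordConstructor([{ PRICE: "12.50", CURRENCY: "EUR" }], "OFFER");
rc.add("PRICE", function (r) { return parseFloat(r[0].PRICE); }, ON_ERROR_RAISE);
// r[1] does not exist, so this expression throws and is ignored.
rc.add("VAT", function (r) { return r[1].VAT; }, ON_ERROR_IGNORE);
var offer = rc.exec();
```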

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.

    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.

    • inputPage: optional; indicates a starting page.

  o exec(): returns a page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.

• Functions:

  o Constructor(page)

    • page: page loaded on the browser that is going to be released.

  o Constructor(browserUuid)

    • browserUuid: browser identifier.

  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT... UNTIL). The Repeat component is implemented in JavaScript as a do... while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
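
The practical difference between a while loop and a do... while loop is that the latter checks its condition after each pass, so its body always runs at least once. The sketch below shows this with a condition that is false from the start. FakeCondition is a hypothetical stand-in that wraps a plain function; it is not the ITPilot Condition object.

```javascript
// Hypothetical stand-in for the Condition object: exec() returns a boolean.
function FakeCondition(predicate) { this.predicate = predicate; }
FakeCondition.prototype.exec = function (inputs) { return this.predicate(inputs); };

var alwaysFalse = new FakeCondition(function () { return false; });

// while: condition checked first, so the body never runs.
var whileRuns = 0;
while (alwaysFalse.exec([])) {
    whileRuns++;
}

// do... while: body runs once before the condition is checked.
var repeatRuns = 0;
do {
    repeatRuns++;
} while (alwaysFalse.exec([]));
```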

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it occupies within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)

    • sequence: NSEQL browsing program (see [NSEQL]).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.

  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.

    • inputValues: list of values that can be used as input parameters within the browsing sequence.

    • inputPage: optional parameter that indicates the page from which the component browsing sequence is run.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o close(): closes the connection with the running browser.

  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.

    • pages: number of back pages.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation.

    • 1: Internet Explorer browser implementation.

    • 2: Firefox browser implementation.

    • 3: Denodo HTTP browser implementation.
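
The setRetries(count) and setRetryDelay(mseconds) settings describe a retry policy. How ITPilot applies it internally is not documented here; the sketch below shows one plausible reading, in which count is the number of extra attempts after the first failure. runWithRetries and flakyNavigation are hypothetical names, and the delay is accumulated in a counter rather than actually waited, so that the example runs instantly.

```javascript
// Sketch of a retry policy; not ITPilot code.
// "retries" is read here as the number of extra attempts after the first one.
function runWithRetries(action, retries, retryDelayMs) {
    var waitedMs = 0;
    var lastError = null;
    for (var attempt = 0; attempt <= retries; attempt++) {
        try {
            return { value: action(), attempts: attempt + 1, waitedMs: waitedMs };
        } catch (e) {
            lastError = e;
            if (attempt < retries) {
                waitedMs += retryDelayMs;   // a real implementation would pause here
            }
        }
    }
    throw lastError;                        // every attempt failed
}

// Simulated navigation that fails twice before succeeding.
var navigationCalls = 0;
function flakyNavigation() {
    navigationCalls++;
    if (navigationCalls < 3) {
        throw new Error("page not loaded");
    }
    return "result-page";
}

var outcome = runWithRetries(flakyNavigation, 3, 500);
```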

5.3.28 Store File

• Object: StoreFile

• Description: stores in a file the contents received as the input parameter.

• Functions:

  o Constructor(content, file)

    • content: string-type or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.

    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread

• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each record obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): blocks until all executions launched with the execute function have finished.

  o execute(functionName, <list of arguments>): launches a thread that runs the given function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.

    • int: maximum number.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix in the path <DENODO_HOME>/metadata/itp-custom-components containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { ... }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:

    • LIST_TYPE = 1

    • PAGE_TYPE = 2

    • RECORD_TYPE = 3

    • SIMPLE_TYPE = 4

    • ARRAY_TYPE = 5

    • BINARY_TYPE = 6

    • BOOLEAN_TYPE = 7

    • DATE_TYPE = 8

    • DOUBLE_TYPE = 9

    • FLOAT_TYPE = 10

    • INT_TYPE = 11

    • LONG_TYPE = 12

    • STRING_TYPE = 13

    • URL_TYPE = 14

    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { ... }

  o This function defines the output structure returned by the component. It is needed only when the output type defined by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
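
Put together, the four functions of a custom component file could look like the following skeleton. This is a sketch under stated assumptions rather than a working ITPilot component: the way input and output structures are represented is not described in this section, so plain objects mapping field names to type labels are used as placeholders, and the NAME and NAME_UPPER fields and the upper-casing logic are invented for illustration. Only the function names and the RECORD_TYPE value come from the list above.

```javascript
// Output type constant, with the value listed above.
var RECORD_TYPE = 3;

// Main function: receives the input element and returns the output.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    // Invented logic: return a record with an upper-cased copy of NAME.
    mycustom_output = { NAME_UPPER: String(mycustom_input.NAME).toUpperCase() };
    return mycustom_output;
}

// Input schema. The object representation is an assumption.
function mycustom_getInputStructure() {
    return { NAME: "string" };
}

// Output type: a record, so an output structure is also required.
function mycustom_getOutputType() {
    return RECORD_TYPE;
}

// Output structure. The object representation is an assumption.
function mycustom_getOutputStructure() {
    return { NAME_UPPER: "string" };
}

var sampleOutput = mycustom_main({ NAME: "abc" });
```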

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following code:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> is the name of the component.

• <input_parameters> is the list of input parameters that the custom component receives.
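
The try/finally structure of Figure 8 guarantees that the scope is closed even when the component fails. The sketch below demonstrates this with stand-ins so that it can run outside ITPilot. The SCOPE and CUSTOM_COMPONENT objects defined here are hypothetical replacements written for this example (the depth counter, the error on empty input and the shape of the returned object are all invented), and callCustom is a helper name that does not exist in ITPilot.

```javascript
// Hypothetical stand-ins so the pattern can run outside ITPilot.
var SCOPE = {
    depth: 0,
    create: function () { this.depth++; },
    close: function () { this.depth--; }
};
function CUSTOM_COMPONENT(type) { this.type = type; this.name = null; }
CUSTOM_COMPONENT.prototype.setComponentName = function (name) { this.name = name; };
CUSTOM_COMPONENT.prototype.exec = function (input) {
    if (input === null || input === undefined) { throw new Error("no input"); }
    return { component: this.name, received: input };
};

// Same structure as Figure 8, wrapped in a helper function.
function callCustom(input) {
    try {
        SCOPE.create();
        var mycustom = new CUSTOM_COMPONENT("mycustom");
        mycustom.setComponentName("mycustom_1");
        return mycustom.exec(input);
    } finally {
        SCOPE.close();      // runs even when exec() throws
    }
}

var customOutput = callCustom({ NAME: "abc" });
var customFailed = false;
try { callCustom(null); } catch (e) { customFailed = true; }
```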

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple. The VQL statement just has to be written as follows:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the recently generated JavaScript code.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are used inside the code, they must be escaped with the ‘’ character.
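Escaping by hand is error-prone for longer scripts, so a small helper can prepare the code before it is pasted into the statement. This is only a sketch: the function name escapeForVql is made up, and both the delimiting quote and the escape character are passed in as parameters because their exact values should be taken from the VQL documentation [VQL]. The values used in the usage line are assumptions, not confirmed syntax.

```javascript
// Hypothetical helper: prefixes every occurrence of the delimiting quote
// with the given escape character and leaves all other characters untouched.
function escapeForVql(jsCode, quoteChar, escapeChar) {
    var out = "";
    for (var i = 0; i < jsCode.length; i++) {
        var c = jsCode.charAt(i);
        if (c === quoteChar) {
            out += escapeChar;
        }
        out += c;
    }
    return out;
}

// Usage, ASSUMING a double quote delimiter and a backslash escape character.
var escaped = escapeForVql('start.setText("INITPARAM");', '"', '\\');
```

Keeping the two characters as parameters means the helper does not need to change if the actual delimiter or escape character differs from the assumed ones.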

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

The method hasNext() checks whether there are still elements to return. Because results arrive asynchronously, this method must be called before accessing each element to make sure that data is available.

The method next() of HTMLWrapperResultIterator obtains the next result. Each result is an instance of the class com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO. The value associated with each field is obtained by invoking the method

com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO getValue(String fieldname)

where fieldname is the name of the desired field. The method next() throws a NoSuchElementException if no data is available at that moment, even if the wrapper still has results to return. Hence the need to call hasNext() first.

As mentioned in the preceding section, the value of a field can be atomic or compound. If it is atomic, the instance of ValueVO belongs to the subclass SimpleVO. SimpleVO is an abstract class whose subclasses correspond to the basic types available in ITPilot: TextVO, IntVO, LongVO, FloatVO, DoubleVO, DateVO, BooleanVO and BlobVO. The subclasses IntVO, LongVO, FloatVO, DoubleVO and BooleanVO provide a method getXXX() (where XXX is the name of the data type) to access their values. For example, IntVO provides the method java.lang.Integer getInt(). BlobVO provides the method java.lang.Byte[] getBytes(), and DateVO provides the method long getTime(). In addition, the SimpleVO superclass provides a representation of the value as a character string, accessible through the getValue() method. See the Javadoc documentation for details [JDOC].

If the value is compound, the instance of ValueVO represents an array of registers (subclass ArrayVO). Its method getValues() returns a list of the registers it contains (instances of the subclass RegisterVO). See the Javadoc documentation for more detailed information on the methods and properties of the class ValueVO and its subclasses.

Another important aspect of processing queries is dealing with any errors that may arise (e.g. an error connecting to the data source). The class HTMLWrapperResultIterator provides two methods for this:

• Boolean checkErrors(). Allows you to check whether an error has occurred during query execution. Returns ‘true’ if an error has occurred and ‘false’ if not.

• String getErrorDescription(). When an error has occurred, this returns a textual description of it. Otherwise it returns null. The custom error messages specified by the wrapper creator for the ‘raise error’ handler (see [GENER]) in the Wrapper Generator Tool are accessed through this method.

3.4.1 Canceling Queries

The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

void cancel()

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the ‘acme’ machine, on port 9999. Next, it obtains a reference to the wrapper called “Movies”, whose schema is the one used as an example in the preceding section:

TITLE, DIRECTOR, EDITIONS: {FORMAT, PRICE, DESCRIPTION}

where TITLE and DIRECTOR are optional search fields. Then a query is issued to the wrapper using the input parameter DIRECTOR with the value “Woody Allen”, and the results are processed and printed to the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then comes the compound field EDITIONS, which is represented by an ArrayVO object that contains a RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text, except the field PRICE, which is a double. Finally, any errors produced during execution are checked.

package com.denodo.itpilot.client;

import java.util.List;
import java.util.HashMap;
import java.util.Map;
import java.util.Iterator;

import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.DoubleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

public class ITPilotExample {

    public static void main(String args[]) {
        try {
            // Connect to server (host acme, port 9999; see section 3.1 for the address format)
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("//acme:9999");

            // Get wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare query params
            Map queryParams = new HashMap();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate results
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + ". ");

                // Get and print atomic fields: TITLE, DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);
                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();
                    Map edition = editionVO.getValues();
                    // FORMAT is read from the map of values of the register
                    SimpleVO formatVO = (SimpleVO) edition.get("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);
                    // PRICE is a double, so the typed accessor is used
                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);
                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\tDESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server ");
        } finally {
        }
    }
}

Figure 1 Example of query execution to a wrapper

4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes included in a Jar file that is added to ITPilot, so they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but it is possible to group them in a single Jar. We recommend developing custom functions using Java annotations, although it is also possible to use naming conventions, which allow custom functions to be created without dependencies on Denodo libraries. The annotations, compound types and values required to create custom functions are located in

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together by uploading/removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature, the class must define a Java method implementing the functionality of the function with those arguments, plus one additional method if the signature returns a different type depending on the parameters or the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be primitive types (int, long, double, etc.).

4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + “ItpFunction”

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or that return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method will be called and the return type will be obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement. Class annotation used to define the class as a custom function. The annotation requires the parameters:

    • name: name of the custom function.

    • type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor. Method annotation used to mark the method as a function signature. This method will be executed when the function is used with the appropriate arguments. The annotation has an optional variable, syntax, that specifies the syntax of the function signature when presenting it to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType. Method annotation used to mark the method as the one that computes the return type of a function signature before executing a query.

• com.denodo.common.custom.annotations.CustomParam. Parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.

4.2 COMPOUND TYPES

Compound types and values in the custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType. Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either a java.lang.Class of one of the Java classes used for simple types, or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue. Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType. Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue. Class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil. Helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object in their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on input values, or functions returning compound types, can implement an additional method in order to compute the return type without executing the function. This is generally optional (rule 1 below states the exception), but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execution method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same or an equivalent type as its respective parameter in the execute method.

If the execute method returns a basic Java type, the additional method has to return the corresponding Java class. I.e. if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table ‘Equivalency between Java and ITPilot data types’ at the beginning of section 4 for the type that these return values will have in ITPilot.

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    @CustomExecutor()
    public CustomArrayValue split_sample(
            @CustomParam(name="regexp") String regex,
            @CustomParam(name="value") String value) {

        if (value == null || regex == null) {
            return null;
        }

        String[] result = value.split(regex);
        LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
        List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);

        // One record with a single text field per substring
        for (String string : result) {
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        // The return type is an array of records with one text field
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}

Figure 2 ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262].

The following sections assume some basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3:

function getInit() {
    var start = new Init();
    start.setText("INITPARAM", "", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}

Figure 3 ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: this must be used to return the set of searchable parameters.
3. getOutputSchema() function: this function is used to return the structure of the output objects, if they exist (1).

These functions mirror the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

(1) Since version 4.0 SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

This function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on in section 5.3 (the Component Catalog).

5.2.2 Main Function

This is the place where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.

5.2.3 Generating the Output Structure

This function determines the wrapper’s output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is described in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if a value has to be given for another argument to its right. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL)

rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY)
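The positional rule described in the note can be illustrated with a tiny stand-in for setText. The Record_Structure below is not the ITPilot implementation, only a hypothetical mock that applies the documented defaults (an empty regular expression and an optional field), and the OPTIONAL and OBLIGATORY values are placeholders for the constants that ITPilot provides.

```javascript
// Placeholder constants; inside ITPilot these are provided by the engine.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Hypothetical mock of Record_Structure, reduced to setText(field, regexp, type).
function Record_Structure(name) {
    this.name = name;
    this.fields = {};
}
Record_Structure.prototype.setText = function (field, regexp, type) {
    this.fields[field] = {
        regexp: regexp === undefined ? "" : regexp, // default: no constraint
        type: type === undefined ? OPTIONAL : type  // default: optional field
    };
};

var rs = new Record_Structure("EXAMPLE");
rs.setText("A");                 // both trailing parameters omitted
rs.setText("B", "", OPTIONAL);   // same effect as the call above
rs.setText("C", "", OBLIGATORY); // regexp must be given to reach type
rs.setText("D", OBLIGATORY);     // wrong: OBLIGATORY lands in the regexp slot
```

In the last call the mock silently stores OBLIGATORY as the regular expression and leaves the field optional, which shows why the two-argument form cannot express a mandatory field.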

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.

5.3.2.1 Record Structure

• Object: Record_Structure

• Description: This represents a data structure that allows the definition of the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).

• Functions:

  o Constructor(name)
    • name: name of the structure

  o setText(field, regexp, type): creation of a new character string field in the record

    • field: name of the new field
    • regexp (optional): regular expression constraining the character string. By default, if no constraint exists, its value is “”.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setLink(field, type): creation of a new Link-type field in the record
    • field: name of the new field
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setInt(field, type): creation of a new Integer-type field in the record
    • field: name of the new field
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setBoolean(field, type): creation of a new boolean-type field in the record
    • field: name of the new field
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setLong(field, type): creation of a new Long-type field in the record
    • field: name of the new field
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setFloat(field, type): creation of a new Float-type field in the record
    • field: name of the new field
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setDouble(field, type): creation of a new Double-type field in the record
    • field: name of the new field
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setBlob(field, type): creation of a new BLOB-type (Binary Large Object) field in the record
    • field: name of the new field
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setDate(field, regexp, format, type): creation of a new Date-type field in the record
    • field: name of the new field
    • regexp (optional): regular expression constraining the character string. By default, if no constraint exists, its value is “”.
    • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setRegister(record, type): creation of a new Record-type field in the record
    • record: record name
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setArray(name, structure, type): creation of a new Array-type field in the record
    • name: name of the array
    • structure: data structure that represents the record structure contained in the array
    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.
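To make the function list more concrete, here is a sketch of how a nested structure could be declared for a schema made of TITLE, DIRECTOR and an EDITIONS array whose elements hold FORMAT, PRICE and DESCRIPTION. Record_Structure is replaced by a hypothetical recording mock so the snippet runs outside ITPilot. The declare helper belongs to the mock only, and the record names MOVIE and EDITION and the OPTIONAL and OBLIGATORY placeholder values are assumptions.

```javascript
// Placeholder constants; inside ITPilot these are provided by the engine.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Hypothetical mock that only records the declared fields.
function Record_Structure(name) {
    this.name = name;
    this.fields = [];
}
// Internal helper of the mock (not an ITPilot function).
Record_Structure.prototype.declare = function (kind, field, type, extra) {
    this.fields.push({
        kind: kind,
        name: field,
        type: type === undefined ? OPTIONAL : type,
        extra: extra
    });
};
Record_Structure.prototype.setText = function (field, regexp, type) {
    this.declare("text", field, type, regexp === undefined ? "" : regexp);
};
Record_Structure.prototype.setDouble = function (field, type) {
    this.declare("double", field, type);
};
Record_Structure.prototype.setArray = function (name, structure, type) {
    this.declare("array", name, type, structure);
};

// Structure of each element of the EDITIONS array.
var edition = new Record_Structure("EDITION");
edition.setText("FORMAT");
edition.setDouble("PRICE");
edition.setText("DESCRIPTION");

// Top-level output record.
var movie = new Record_Structure("MOVIE");
movie.setText("TITLE", "", OBLIGATORY);
movie.setText("DIRECTOR");
movie.setArray("EDITIONS", edition);
```

Declaring the inner structure first keeps the top-level record readable, since setArray only needs a reference to it.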

5.3.2.2 Record List

• Object: List

• Functions:

  o setListName(listName): sets the name of the list
    • listName: name of the list

  o add(obj): addition of an element to the list
    • obj: element to add

  o toArray(): transforms the list into a JavaScript object array

533 Common functions

Some of these functions are common to all or almost all components and are therefore shown in this first section The catalog explains the components that do not contain some of the ldquocommonrdquo functions

5331 onError function

bull onError(errorId errorAction) This informs the component of its behavior in the event of any type of error The onError function can be invoked several times with different errorId parameter values

o errorId This indicates the type of error for which the behavior is to be managed The possible values are

bull RUNTIME_ERROR error while the component is being run

bull CONNECTION_ERROR error that occurs when there is some kind of connection problem with the Web source

bull HTTP_ERROR error produced by an http error

bull TIMEOUT_ERROR This error is caused if the Web source takes time in answering The waiting time is configurable Where the wrapper is used in the run environment this parameter is configured in the browser pool used (see [USER]) In the generation environment in question this value is configured in the ITPAdminConfigurationproperties file available in ltDENODO_HOMEgtconfitp-admin-tool with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for http browser

    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).

  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:

    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.

    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value return “null” when an error occurs, except FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5) also return “null”, but it is evaluated as “false” if they are used in a condition expression.

    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configurable (for example, with the setRetries and setRetryDelay functions).

    • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error persists after the retries.
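The four error actions can be pictured with a small, self-contained sketch. It is not ITPilot code: runWithPolicy is a hypothetical helper, the string values given to the constants are placeholders, the retried unit is a single step function rather than a whole wrapper, and treating unregistered error types as ON_ERROR_RAISE is an assumption made here for simplicity.

```javascript
// Constants named after the documented values; the string values are
// placeholders chosen for this sketch.
const ON_ERROR_RAISE = "RAISE";
const ON_ERROR_IGNORE = "IGNORE";
const ON_ERROR_RETRY = "RETRY";
const ON_ERROR_RETRY_IGNORE = "RETRY_IGNORE";

// Hypothetical helper: runs `step` and applies the action registered for
// the error type it throws (read from err.errorId).
function runWithPolicy(step, actions, retries) {
  let attempts = 0;
  for (;;) {
    attempts += 1;
    try {
      return { value: step(attempts), attempts: attempts };
    } catch (err) {
      const action = actions[err.errorId] || ON_ERROR_RAISE;
      if (action === ON_ERROR_RAISE) throw err;
      if (action === ON_ERROR_IGNORE) return { value: null, attempts: attempts };
      // Both retry actions try again until the retry budget is spent.
      if (attempts <= retries) continue;
      if (action === ON_ERROR_RETRY_IGNORE) return { value: null, attempts: attempts };
      throw err; // ON_ERROR_RETRY: give up after the last retry
    }
  }
}

function timeoutError() {
  const e = new Error("source too slow");
  e.errorId = "TIMEOUT_ERROR";
  return e;
}

const actions = {};
actions.TIMEOUT_ERROR = ON_ERROR_RETRY_IGNORE;
actions.HTTP_ERROR = ON_ERROR_IGNORE;

// Fails twice, then succeeds on the third attempt.
const flaky = runWithPolicy(function (attempt) {
  if (attempt < 3) throw timeoutError();
  return "page";
}, actions, 3);

// Always fails: after 3 retries the error is swallowed and null is returned.
const dead = runWithPolicy(function () {
  throw timeoutError();
}, actions, 3);

console.log(flaky.value, flaky.attempts, dead.value, dead.attempts); // prints "page 3 null 4"
```

The only difference between the two retry actions shows up after the last retry: ON_ERROR_RETRY propagates the error, whereas ON_ERROR_RETRY_IGNORE behaves like ON_ERROR_IGNORE.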

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log and 5 means that all message types are written to the log file. The log types are the following:

  o TRACE

  o DEBUG

  o INFO

  o WARN

  o ERROR

  o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List

• Description: adds a record to a list.

• Functions:

  o Constructor()

  o exec(record, list): executes the function.

    • record: record to be added to the list.

    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition

• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.

• Functions:

  o Constructor(expr)

    • expr: defines the condition expression. It is expressed as a string of characters (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(elements): main function of the Condition component. It evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.

    • elements: determines the elements on which the condition is evaluated. It must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”.
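The relation between the placeholders $0, $1 and the positions of the elements array can be illustrated with a very small stand-in. ConditionSketch is a hypothetical name, and it understands only a single comparison between two placeholders; the operator set beyond <= is an assumption of this sketch, and the real expression language is the one described in Appendix A of [GENER].

```javascript
// Stand-in for illustration only: accepts one comparison of the form
// "($i OP $j)" with OP in <=, >=, <, > and nothing else.
function ConditionSketch(expr) {
  const match = /^\(\s*\$(\d+)\s*(<=|>=|<|>)\s*\$(\d+)\s*\)$/.exec(expr);
  if (!match) {
    throw new Error("unsupported expression in this sketch: " + expr);
  }
  const left = Number(match[1]);
  const operator = match[2];
  const right = Number(match[3]);

  // exec receives the elements; $n refers to elements[n].
  this.exec = function (elements) {
    const a = elements[left];
    const b = elements[right];
    switch (operator) {
      case "<=": return a <= b;
      case ">=": return a >= b;
      case "<": return a < b;
      default: return a > b;
    }
  };
}

const myCondition = new ConditionSketch("($0 <= $1)");
console.log(myCondition.exec([3, 7])); // prints true
console.log(myCondition.exec([9, 7])); // prints false
```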

5.3.6 Create List

• Object: Create_List

• Description: creates an empty list.

• Functions:

  o Constructor(listname): creates an empty list.

    • listname: name of the list of records to be created.

  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser

• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.

• Functions:

  o Constructor(): creates a persistent browser and returns its handler.

  o exec(): executes the component.

5.3.8 Diff

• Object: Diff

• Description: compares two pages and returns the differences between them, based on the retrieved HTML code.

• Functions:

  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)

    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).

    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green-background HTML end tag).

    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).

    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red-background HTML end tag).

    • tokenSeparator: character string used to separate HTML page elements when the result page is generated, so that each of them can be properly identified.

  o diff(baseCode, finalCode): returns “true” if both pages are identical and “false” if they are different.

    • baseCode: character string with the source page content.

    • finalCode: character string or page object with the target page content.

  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of the pages with the differences between them marked.

    • baseCode: character string with the source page content.

    • finalCode: character string or page object with the target page content.

  o setAdditionPrefixLabel(additionPrefixLabel): modifies the start tag that marks added content.

    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).

  o setAdditionSuffixLabel(additionSuffixLabel): modifies the end tag that marks added content.

    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green-background HTML end tag).

  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the start tag that marks deleted content.

    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).

  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the end tag that marks deleted content.

    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red-background HTML end tag).

  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.

    • nullWhenEquals: “true” means that “null” is returned when both pages are equal; “false” means that the result page is returned.

  o setIgnoreTagAttributes(simplifyTags): makes the component disregard HTML tag attributes when comparing both pages.

    • simplifyTags: “true” means that the HTML tag attributes are ignored; with “false” they are not ignored.

  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.

    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.

  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page.

    • mergedDeletions: with “true”, the deleted content is shown. With “false”, the configuration made with the setDeletionPrefixLabel and setDeletionSuffixLabel functions is not taken into account.

  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.

    • replacement: Perl regular expression [PERL].

  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page; tokens that match a regular expression are discarded before the comparison starts.

    • regexp: Perl regular expression [PERL].
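To make the role of the prefix and suffix labels and of the token separator more concrete, here is a generic token-level diff written as a sketch. It is not the algorithm used by the Diff component, which this guide does not describe: it splits the markup into tag and text tokens, aligns them with a longest-common-subsequence table, and wraps added and deleted tokens in the labels supplied. The function names, the option names and the bracket labels are all invented for the example.

```javascript
// Splits markup into tag tokens and text tokens (a simplification).
function tokenize(html) {
  return html.match(/<[^>]*>|[^<]+/g) || [];
}

// Longest-common-subsequence lengths for every pair of suffixes of a and b.
function lcsTable(a, b) {
  const table = [];
  for (let i = 0; i <= a.length; i++) {
    table.push(new Array(b.length + 1).fill(0));
  }
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      table[i][j] = a[i] === b[j]
        ? table[i + 1][j + 1] + 1
        : Math.max(table[i + 1][j], table[i][j + 1]);
    }
  }
  return table;
}

// Returns the merged page with added and deleted tokens wrapped in labels.
function diffSketch(baseCode, finalCode, opts) {
  const a = tokenize(baseCode);
  const b = tokenize(finalCode);
  const table = lcsTable(a, b);
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && a[i] === b[j]) {
      out.push(a[i]); // token present in both pages
      i++;
      j++;
    } else if (i < a.length && (j === b.length || table[i + 1][j] >= table[i][j + 1])) {
      out.push(opts.delPrefix + a[i] + opts.delSuffix); // only in the base page
      i++;
    } else {
      out.push(opts.addPrefix + b[j] + opts.addSuffix); // only in the final page
      j++;
    }
  }
  return out.join(opts.tokenSeparator);
}

const labels = {
  addPrefix: "[+", addSuffix: "+]",
  delPrefix: "[-", delSuffix: "-]",
  tokenSeparator: ""
};
const marked = diffSketch("<p>old</p>", "<p>new</p>", labels);
console.log(marked); // prints "<p>[-old-][+new+]</p>"
```

In the same spirit, options such as setCaseInsensitive or addIgnoredToken can be thought of as transformations applied to the token arrays before the alignment step.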

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4 Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.

• Functions:

  o Constructor(expression)

    • expression: object that defines the condition expression. It is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.

    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor

• Description: extracts structured data from an HTML page by means of a DEXTL program ([DEXTL]).

• Functions:

  o Constructor(name, page, specification, structure)

    • name: name of the Extractor component instance.

    • page: page-type ITPilot structure from which data is to be extracted.

    • specification: DEXTL data extraction specification (see [DEXTL]).

    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.

  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined by the structure parameter of the constructor.

  o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).

    • merge: Boolean parameter. “true” if the pattern-merging technique is to be applied, “false” if not. It is “true” by default.

  o setI18n(i18n): updates the internationalization configuration of the process.

    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

  o Constructor(url, sequenceType, reusableConnection, binary, page)

    • url: URL where the resource to be downloaded can be found (OPTIONAL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.

    • page: optionally, the page from which the HTTP request is launched.

  o exec(page): runs the component, returning the string- or binary-type value obtained.

    • page: optionally, the page from which the HTTP request is launched.

  o setEncoding(encoding): lets the user set the MIME type [MIME] of the information to send.

    • encoding: MIME type of the information to send.

  o syncWithPost(flag): sets the method for recovering the page state. With this method, ITPilot sends a POST message to the page URL with the POST parameters that were initially used to access that page. This is the default synchronization method.

    • flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.

  o setBackSequence(back): optionally sets an explicit navigation sequence that returns to the source page, against which further data extraction operations will be executed.

    • back: back sequence NSEQL program.

  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

    • reusingConnection: if set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing information from the previous session.

  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor POST synchronization has been configured.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
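The order in which the three synchronization options are considered (POST synchronization, then an explicit back sequence, then the Back() command) can be summarized in a few lines. The helper below is hypothetical, as are its field names and the strategy labels it returns; the fallback of one page when no page count is given is also an assumption of the sketch.

```javascript
// Hypothetical helper: picks the synchronization strategy from the settings
// described above. It only decides; it performs no navigation.
function chooseBackStrategy(cfg) {
  if (cfg.syncWithPost) {
    // Re-send the POST request that originally produced the page.
    return { kind: "POST" };
  }
  if (cfg.backSequence) {
    // An explicit NSEQL back program was supplied with setBackSequence.
    return { kind: "BACK_SEQUENCE", nseql: cfg.backSequence };
  }
  // Otherwise fall back to Back(), going back the configured number of pages.
  return { kind: "BACK_COMMAND", pages: cfg.backPages || 1 };
}

const byPost = chooseBackStrategy({ syncWithPost: true });
const bySequence = chooseBackStrategy({
  syncWithPost: false,
  backSequence: "<NSEQL back program>"
});
const byBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });

console.log(byPost.kind, bySequence.kind, byBack.kind, byBack.pages);
// prints "POST BACK_SEQUENCE BACK_COMMAND 2"
```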

5.3.13 Filter

• Object: Filter

• Description: filters a list of records, returning those that meet a given condition.

• Functions:

  o Constructor(expr, auxiliaryRecords)

    • expr: selection expression of the filtering operation, applied to the list of records passed to the exec function.

    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.

    • inputRecords: list of input records.

    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

NOTE: if the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
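The behaviour described in the note can be sketched with plain JavaScript. filterIgnoringErrors is a hypothetical stand-in, the predicate plays the role of the selection expression, and the PRICE field is invented for the example.

```javascript
// Hypothetical stand-in: `predicate` plays the role of the selection
// expression and may throw for a malformed record.
function filterIgnoringErrors(inputRecords, predicate) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record)) {
        kept.push(record);
      }
    } catch (err) {
      // ON_ERROR_IGNORE: drop only the record that caused the error
      // and carry on with the remaining ones.
    }
  }
  return kept;
}

const records = [{ PRICE: 5 }, { PRICE: null }, { PRICE: 20 }, { PRICE: 8 }];
const cheap = filterIgnoringErrors(records, function (record) {
  if (record.PRICE === null) {
    throw new Error("PRICE is missing");
  }
  return record.PRICE < 10;
});

const cheapPrices = cheap.map(function (record) { return record.PRICE; });
console.log(cheapPrices.join(", ")); // prints "5, 8"
```

The record with the missing price is left out of the result, but it does not prevent the records after it from being evaluated.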

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: generates an execution loop over a specific form, using predetermined values for each of its fields in each run.

• Functions:

  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

    • findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • baseElements: optional list of records that can be used as variables in the NSEQL browsing sequences of this component.

    • inputPage: input page from which the selected form is iteratively invoked.

    • parallelIterator: if “true”, the component executes its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this purpose:

      • CLICKED_ELEMENT: marks the element.

      • NON_CLICKED_ELEMENT: leaves the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • valuesArray: list of values that must be selected in the field.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).

    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this purpose:

      • CLICKED_ELEMENT: marks the element.

      • NON_CLICKED_ELEMENT: leaves the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.

    • field: name of the HTML selection field.

    • position: position of the field when more than one field has the same name.

    • positions: values of the elements over which the component must iterate.

  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.

    • field: name of the HTML text field.

    • position: position of the field when several fields on the form have the same name.

    • values: list of values that must be selected in the field.

    • positions: list that indicates the position held by each values element in the event of replicated values.

    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.

  o click(field, value, state): selects an element and runs a “click” event on it.

    • field: name of the HTML field on which the click is to be made.

    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.

    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:

      • CLICKED_ELEMENT: marks the element.

      • NON_CLICKED_ELEMENT: leaves the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o input(field, position, values): indicates the values added to an input field.

    • field: name of the HTML input field.

    • position: position of the field when several fields on the form have the same name.

    • values: list of values that must be entered in the field.

  o textarea(field, position, values): indicates the values added to a text area.

    • field: name of the HTML text area field.

    • position: position of the field when several fields on the form have the same name.

    • values: list of values that must be entered in the field.

  o toList(): returns the list with the NSEQL sequences used in each iteration.

  o setMaxIterations(count): sets the maximum number of iterations that can be executed.

    • count: maximum number of iterations.

  o setRetries(count): sets the number of retries in the event of failure.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o setParallelIterator(flag): makes the component launch its iterations in parallel.

    • flag: if “true”, the iterations are executed in parallel.

  o next(inputPage): returns the page resulting from running one iteration of the component.

    • inputPage: optional parameter that indicates a new starting page on which the next iteration is run.

  o hasNext(): determines whether there are more results. Returns “true” if there is at least one more result, or “false” if there is not.

  o close(): closes the iterator.

  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL with the POST parameters with which it was reached. This is the default synchronization method.

    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data is imported from the previous session.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor POST synchronization has been configured.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
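The iteration protocol of this component (configure the field values, then loop with hasNext() and next(), then close()) can be tried out with a stand-in that involves no browser. FormIteratorSketch is a hypothetical class: its next() returns one combination of field values rather than a result page, only input, click, setMaxIterations, hasNext, next and close are mimicked, and combining the fields as a Cartesian product is an assumption of this sketch rather than documented behaviour. The field names and the constant values are invented.

```javascript
const CLICKED_ELEMENT = "clicked";
const NON_CLICKED_ELEMENT = "non-clicked";
const CLICKED_AND_NON_CLICKED_ELEMENT = "both";

// Stand-in iterator: yields combinations of field values, not pages.
function FormIteratorSketch() {
  this.fields = [];
  this.combos = null;
  this.cursor = 0;
  this.maxIterations = Infinity;
}
FormIteratorSketch.prototype.input = function (field, position, values) {
  this.fields.push({ name: field, options: values });
};
FormIteratorSketch.prototype.click = function (field, value, state) {
  // "Both" expands into two alternatives, one marked and one unmarked.
  const options = state === CLICKED_AND_NON_CLICKED_ELEMENT
    ? [CLICKED_ELEMENT, NON_CLICKED_ELEMENT]
    : [state];
  this.fields.push({ name: field, options: options });
};
FormIteratorSketch.prototype.setMaxIterations = function (count) {
  this.maxIterations = count;
};
FormIteratorSketch.prototype.build = function () {
  if (this.combos) return;
  let combos = [{}];
  this.fields.forEach(function (f) {
    const extended = [];
    combos.forEach(function (combo) {
      f.options.forEach(function (option) {
        const copy = Object.assign({}, combo);
        copy[f.name] = option;
        extended.push(copy);
      });
    });
    combos = extended;
  });
  this.combos = combos.slice(0, this.maxIterations);
};
FormIteratorSketch.prototype.hasNext = function () {
  this.build();
  return this.cursor < this.combos.length;
};
FormIteratorSketch.prototype.next = function () {
  this.build();
  return this.combos[this.cursor++];
};
FormIteratorSketch.prototype.close = function () {
  this.combos = [];
  this.cursor = 0;
};

const formIterator = new FormIteratorSketch();
formIterator.input("city", 0, ["Madrid", "Paris"]);
formIterator.click("onlyOffers", "yes", CLICKED_AND_NON_CLICKED_ELEMENT);

const seen = [];
while (formIterator.hasNext()) {
  const combo = formIterator.next();
  seen.push(combo.city + "/" + combo.onlyOffers);
}
formIterator.close();
console.log(seen.length); // prints 4
```

Two input values combined with a checkbox in the “both” state give four iterations, which is why setMaxIterations is useful as a guard when several fields are iterated at once.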

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, given a previously retrieved identifier.

• Functions:

  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.

    • browserUuid: browser identifier.

  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).

    • pageType: type of browser used to access the page.

      • SEQUENCE_IEBROWSER = 1

      • SEQUENCE_HTTP_BROWSER = 2

    • lastURL: last URL from which the page comes.

    • lastURLMethod: access method (GET, POST) of the URL from which the page comes.

    • lastURLPostParameters: POST-method parameters of the URL from which the page comes.

    • cookie: stored “cookie” information.

    • proxyUser: user name to access the proxy, if required.

    • proxyPassword: user password to access the proxy, if required.

    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.

• Functions:

  o Constructor(input, output)

    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process: if not specified, the record is generated at runtime (by the exec() function).

  o get(name): returns the value of a field of the record created from the initialization parameters.

    • name: name of the record field.

  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

    • field: name of the field to create.

    • format: representation format of the date field. This format is optional, but if it is specified it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].

    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setName(name) update function for the component name

bull name new component name

o setI18n(i18n) function which updates the process i18n

bull i18n type of internationalization to be used ITPilot provides different types of i18n configurations such as ES_EURO US_PST GB etc See [GENER] for more information about internationalization in ITPilot

o exec() main function for running the component returning a record representing the wrapper initialization parameters
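The three values of obl can be read as a small validation rule. The sketch below is plain JavaScript and does not use the ITPilot objects: resolveParameters, the fields array, the field names and the string constants are all hypothetical stand-ins. It also assumes that a FIXED field ignores any value supplied in the query, which the text above does not state explicitly.

```javascript
// Hypothetical stand-ins for the three obl values described above.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Builds an initialization record for one query.
// fields: [{ name, obl, fixedValue }]; query: { fieldName: value }.
function resolveParameters(fields, query) {
    var record = {};
    fields.forEach(function (field) {
        var obl = field.obl || OPTIONAL; // OPTIONAL is the default value
        if (obl === FIXED) {
            // Assumption: the constant wins over any value in the query.
            record[field.name] = field.fixedValue;
        } else if (Object.prototype.hasOwnProperty.call(query, field.name)) {
            record[field.name] = query[field.name];
        } else if (obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + field.name);
        }
        // OPTIONAL fields without a value are left out of the record.
    });
    return record;
}

var fields = [
    { name: "IN_STOCK", obl: OPTIONAL },
    { name: "HOME_URL", obl: FIXED, fixedValue: "http://www.example.com" },
    { name: "RELEASED", obl: OBLIGATORY }
];

var initRecord = resolveParameters(fields, { RELEASED: "2011-08-16" });
```

In this sketch a query that omits RELEASED raises an error, while one that omits IN_STOCK simply produces a record without that field.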


5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
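The call pattern of Figure 5 can be tried outside ITPilot with simple stand-ins. In the sketch below, ListIterator and SequentialThread are hypothetical replacements for the Iterator and Thread objects (the stand-in thread runs each call immediately rather than concurrently), and processRecord and the sample records are invented for illustration.

```javascript
// Stand-in for the Iterator object: walks a list of records in order.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
ListIterator.prototype.next = function () {
    return this.list[this.position++];
};

// Stand-in for the Thread object: execute() calls the function right away,
// passing on every argument that follows the function itself.
function SequentialThread() {}
SequentialThread.prototype.execute = function (fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    fn.apply(null, args);
};
SequentialThread.prototype.wait = function () {
    // Nothing to wait for: every call has already finished.
};

var titles = [];

// Function run for each record; its parameters match the arguments
// handed to execute() after the function.
function processRecord(structureInstance, recordInstance) {
    titles.push(structureInstance.prefix + recordInstance.TITLE);
}

var iterator = new ListIterator([{ TITLE: "Manhattan" }, { TITLE: "Zelig" }]);
var thread = new SequentialThread();
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    thread.execute(processRecord, { prefix: "Film: " }, recordInstance);
}
thread.wait();
```

Because the stand-in runs sequentially, titles keeps the list order; with real concurrent execution the completion order would not be guaranteed.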


5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.


5.3.19 Loop

• Description: this allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE... DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }

Figure 6 Using the Loop function


5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: iterates over different inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation


5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds a record to be used as the wrapper result.
    • record: record to use.


5.3.22 Record Constructor

• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression. For example, "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements except for the one that caused the error.


5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.


5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.


5.3.25 Repeat

• Description: this allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT... UNTIL). The Repeat component is implemented in JavaScript using a do... while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
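The practical difference between the while structure of the Loop component (section 5.3.19) and the do... while structure of Repeat is when the condition is first checked. The following sketch uses plain JavaScript only; StubCondition, trueTimes, runLoop and runRepeat are hypothetical helpers that stand in for the Condition object and the generated code, and they simply count how many times the loop body runs.

```javascript
// Hypothetical stand-in for Condition: exec() evaluates a supplied function.
function StubCondition(test) {
    this.test = test;
}
StubCondition.prototype.exec = function (inputs) {
    return this.test(inputs);
};

// Builds a condition that answers true n times and false afterwards.
function trueTimes(n) {
    var left = n;
    return new StubCondition(function () {
        return left-- > 0;
    });
}

// Loop shape: the condition is checked before every pass.
function runLoop(condition) {
    var passes = 0;
    while (condition.exec([])) {
        passes++;
    }
    return passes;
}

// Repeat shape: the body runs once before the first check.
function runRepeat(condition) {
    var passes = 0;
    do {
        passes++;
    } while (condition.exec([]));
    return passes;
}

var loopPassesNever = runLoop(trueTimes(0));     // body never runs
var repeatPassesNever = runRepeat(trueTimes(0)); // body runs once
var loopPassesTwice = runLoop(trueTimes(2));
var repeatPassesTwice = runRepeat(trueTimes(2));
```

With a condition that is false from the start, the while form skips the body entirely and the do... while form still executes it once, which matters when the body performs navigation or extraction.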


5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it occupies in the process flow.


5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): update function for the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation


5.3.28 Store File

• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): update function for the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.


5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.


5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.

• mycustom_getInputStructure() { ... }
  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { ... }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
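As an illustration of the naming scheme above, the following is a minimal sketch of what a file named mycustom.js might contain for a component that upper-cases a string. The type constants are redefined locally only so that the sketch runs on its own, the shape returned by mycustom_getInputStructure is a hypothetical placeholder (the real schema representation is not described in this section), and mycustom_needsOutputStructure is an invented helper, not part of the required interface.

```javascript
// Output type codes copied from the list above; redefined here only so the
// sketch is self-contained.
var LIST_TYPE = 1;
var RECORD_TYPE = 3;
var STRING_TYPE = 13;

// Main function: receives the component input and returns its output.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    mycustom_output = String(mycustom_input).toUpperCase();
    return mycustom_output;
}

// Input schema. The object returned here is a hypothetical placeholder.
function mycustom_getInputStructure() {
    return { name: "mycustom_input", type: STRING_TYPE };
}

// Output type: a plain string in this sketch.
function mycustom_getOutputType() {
    return STRING_TYPE;
}

// Invented helper: mycustom_getOutputStructure is only necessary when the
// output type is RECORD_TYPE or LIST_TYPE, so this component can omit it.
function mycustom_needsOutputStructure() {
    var type = mycustom_getOutputType();
    return type === RECORD_TYPE || type === LIST_TYPE;
}
```

Since the output type is STRING_TYPE, no mycustom_getOutputStructure function is defined in this sketch.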

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section.

To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the VQL statement only has to be written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally they must be escaped with the ‘’ character.


REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl


3.4.1 Canceling Queries

The following method of the class HTMLWrapperResultIterator can be used to cancel the execution of an ongoing query:

    void cancel()

3.5 EXAMPLE OF USE

This section shows a simple example of how to use the API. The application starts by connecting to an execution server installed on the machine "acme", on port 9999. Next, it obtains a reference to the wrapper called "Movies", whose schema is the same as the one used as an example in the preceding section:

    TITLE, DIRECTOR, EDITIONS {FORMAT, PRICE, DESCRIPTION}

where TITLE and DIRECTOR are optional search fields.

Then a query is issued to the wrapper using the input parameter DIRECTOR with the value "Woody Allen", and the results are processed and shown on the standard output.

To process the results, the hierarchical structure of ValueVO elements is navigated. First, the SimpleVO objects that represent the atomic fields TITLE and DIRECTOR are obtained. Then the compound field EDITIONS is obtained; it is represented by an ArrayVO object that contains one RegisterVO object for each edition of the film. Each of these registers contains the atomic fields FORMAT, PRICE and DESCRIPTION. All atomic fields are of type text, except the field PRICE, which is a double. Finally, any errors produced during execution are checked.


    package com.denodo.itpilot.client;

    import java.util.List;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Iterator;

    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
    import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
    import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;
    // DoubleVO (used below for PRICE) must be imported as well; its package is
    // not shown in this example.

    public class ITPilotExample {

        public static void main(String args[]) {
            try {
                // Connect to server
                HTMLWrapperServerProxy server =
                        new HTMLWrapperServerProxy("acme", 9999);

                // Get wrapper
                HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

                // Prepare query params
                Map queryParams = new HashMap();
                queryParams.put("DIRECTOR", "Woody Allen");

                // Execute query
                HTMLWrapperResultIterator results = wrapper.query(queryParams);

                // Iterate results
                int numOfTuples = 0;
                while (results.hasNext()) {
                    numOfTuples++;
                    StandardRowVO tuple = (StandardRowVO) results.next();

                    // Process each tuple
                    System.out.print(numOfTuples + ": ");

                    // Get and print atomic fields TITLE, DIRECTOR
                    SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                    String title = (String) titleVO.getValue();
                    System.out.println("TITLE: " + title);

                    SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                    String director = (String) directorVO.getValue();
                    System.out.println("DIRECTOR: " + director);

                    // Get EDITIONS array
                    ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                    // Iterate over EDITION registers
                    int numEditions = 0;
                    Iterator editions = editionsVO.getValues().iterator();
                    while (editions.hasNext()) {
                        numEditions++;
                        System.out.println("EDITION " + numEditions);

                        RegisterVO editionVO = (RegisterVO) editions.next();
                        Map edition = editionVO.getValues();

                        // Read each atomic field of the edition register
                        SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                        String format = (String) formatVO.getValue();
                        System.out.println("\t FORMAT: " + format);

                        DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                        Double price = priceVO.getDouble();
                        System.out.println("\t PRICE: " + price);

                        SimpleVO descriptionVO =
                                (SimpleVO) editionVO.getValue("DESCRIPTION");
                        String description = (String) descriptionVO.getValue();
                        System.out.println("\t DESCRIPTION: " + description);
                    }
                    System.out.println();
                }

                // Check errors
                if (results.checkErrors()) {
                    System.out.println("Error: " + results.getErrorDescription());
                }
            } catch (Exception e) {
                System.err.println("Error trying to access server");
            } finally {
            }
        }
    }

Figure 1 Example of query execution to a wrapper


4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created without dependencies on Denodo libraries by following naming conventions, but we recommend developing them with Java annotations. The annotations, compound types and values required to create custom functions are located in:

    $DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar is loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading or removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class that defines the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing its whole implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, different types or both). For each signature of the function, the class must define a Java method that implements the function for those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

    Java                  ITPilot
    java.lang.Integer     int
    java.lang.Long        long
    java.lang.Float       float
    java.lang.Double      double
    java.lang.Boolean     boolean
    java.lang.String      text
    java.util.Calendar    date
    byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be basic types (int, long, double, etc.).

4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g., the class Concat_SampleItpFunction with a method declared as public String execute(String ... inputs).

Custom functions whose return type depends on the type of their input parameters, or that return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining this method is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement. Class annotation used to define the class as a custom function. The annotation requires the parameters:
    o name: name of the custom function.
    o type: in ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor. Method annotation used to specify the method as a function signature. This method will be executed when using the function with the appropriate arguments. The annotation has an optional variable, syntax, used to specify the syntax of the function signature when presenting it to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType. Method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam. Parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.

4.2 COMPOUND TYPES

Compound types and values in the custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType. Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either a java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue. Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType. Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue. Class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil. Helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object in their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on the input values, or that return compound types, can implement an additional method to compute the return type without executing the function. This is optional in most cases (rule 1 below states when it is required), but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execution method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same or equivalent type as its respective parameter in the execute method.

If the execute method returns a basic Java type, the additional method has to return the same basic Java class. I.e., if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 to know the type that these return values will have in ITPilot.

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name="regexp") String regex,
                @CustomParam(name="valuer") String value) {

            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2 ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some previous basic knowledge of the JavaScript language.

Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3 ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist¹.

These functions mirror the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

¹ Since version 4.0SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further in section 5.3 (the component catalog).

5.2.2 Main Function

This is the place where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The published functions of every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure

The getOutputSchema() function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL)
rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY)
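The positional rule in the note above can be illustrated with a small stand-in. The sketch below is not ITPilot code: StubRecordStructure and the values given to OPTIONAL and OBLIGATORY are hypothetical placeholders that only mimic the documented setText(field, regexp, type) signature and its defaults, so that the snippet can run outside ITPilot.

```javascript
// Hypothetical stand-ins for the ITPilot constants (assumed values, illustration only).
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Minimal stand-in that mimics the documented setText(field, regexp, type) signature.
function StubRecordStructure(name) {
    this.name = name;
    this.fields = [];
}

StubRecordStructure.prototype.setText = function (field, regexp, type) {
    // Arguments are positional: a missing trailing argument takes its default.
    if (regexp === undefined) { regexp = ""; }
    if (type === undefined) { type = OPTIONAL; }
    this.fields.push({ field: field, regexp: regexp, type: type });
};

var rs = new StubRecordStructure("OUT_REC");
rs.setText("FIELD_A");                  // regexp "" and type OPTIONAL by default
rs.setText("FIELD_B", "", OPTIONAL);    // same result, written out in full
rs.setText("FIELD_C", "", OBLIGATORY);  // regexp must be given to reach the type argument
rs.setText("FIELD_D", OBLIGATORY);      // wrong: OBLIGATORY lands in the regexp position
```

In this stand-in the last call stores OBLIGATORY as the regular expression and leaves the field optional, which is the kind of silent mismatch the rule is meant to prevent.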

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.

5.3.2.1 Record Structure

• Object: Record_Structure
• Description: represents a data structure that allows the definition of the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
    o Constructor(name)
        • name: name of the structure.
    o setText(field, regexp, type): creates a new character string field in the record.

        • field: name of the new field.
        • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
    o setLink(field, type): creates a new Link-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
    o setInt(field, type): creates a new Integer-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
    o setBoolean(field, type): creates a new Boolean-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
    o setLong(field, type): creates a new Long-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
    o setFloat(field, type): creates a new Float-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
    o setDouble(field, type): creates a new Double-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
    o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

    o setDate(field, regexp, format, type): creates a new Date-type field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
        • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
    o setRegister(record, type): creates a new Record-type field in the record.
        • record: record name.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
    o setArray(name, structure, type): creates a new Array-type field in the record.
        • name: name of the array.
        • structure: data structure that represents the record structure contained in the array.
        • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
    o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR, as explained in section 5.3.22, must be used, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List
• Functions:
    o setListName(listName): sets the name of the list.
        • listName: name of the list.
    o add(obj): adds an element to the list.
        • obj: element to add.
    o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these "common" functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave when an error occurs. The onError function can be invoked several times with different errorId parameter values.
    o errorId: type of error whose handling is being configured. The possible values are:
        • RUNTIME_ERROR: error while the component is being run.
        • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
        • HTTP_ERROR: error produced by an HTTP error.
        • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the execution environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
        • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
    o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
        • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
        • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have any kind of return value will return "null" when there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it will be evaluated as "false" if they are used in a condition expression.
        • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configurable.
        • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error is still happening after the retries.
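The difference between the four error actions can be summarized in a few lines of plain JavaScript. This is only a sketch of the semantics as read from the descriptions above, not ITPilot's implementation: runWithPolicy, the retries argument and the string values given to the constants are hypothetical, the text's "rerun" is modeled as calling a single function again, and the delay between retries is left out. It also assumes that ON_ERROR_RETRY raises the error once the retries are exhausted, which the text implies only by contrast with ON_ERROR_RETRY_IGNORE.

```javascript
// Hypothetical values: in ITPilot these constants are provided by the runtime.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Runs step() under the given error action; retries is the number of extra attempts.
function runWithPolicy(step, errorAction, retries) {
    var retrying = (errorAction === ON_ERROR_RETRY || errorAction === ON_ERROR_RETRY_IGNORE);
    var attempts = retrying ? retries + 1 : 1;
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return step();
        } catch (e) {
            lastError = e;
        }
    }
    if (errorAction === ON_ERROR_IGNORE || errorAction === ON_ERROR_RETRY_IGNORE) {
        return null; // continue the run with a null result
    }
    throw lastError; // ON_ERROR_RAISE, or ON_ERROR_RETRY with all retries used up
}

// A step that fails twice and then succeeds.
var calls = 0;
function flakyStep() {
    calls++;
    if (calls < 3) { throw new Error("TIMEOUT_ERROR"); }
    return "page";
}

var retried = runWithPolicy(flakyStep, ON_ERROR_RETRY, 3);
var ignored = runWithPolicy(function () { throw new Error("HTTP_ERROR"); }, ON_ERROR_IGNORE, 0);
```

Here retried holds the value returned on the third attempt, while ignored is null because the failure was swallowed.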

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace file and 5 means that all message types are written to it. The log types are the following:
    o TRACE
    o DEBUG
    o INFO
    o WARN
    o ERROR
    o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
    o Constructor()
    o exec(record, list): executes the function.
        • record: record to be added to the list.
        • list: list to which the record is added.

5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
    o Constructor(expr)
        • expr: this parameter defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
        • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements on which the condition is evaluated.
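The way $0, $1, ... refer to positions in the elements list can be shown with a tiny evaluator. This is an illustration of the positional idea only: evalPositional is a hypothetical helper, it accepts numeric elements only, and it relies on JavaScript operators, whereas real condition expressions use the ITPilot function set described in Appendix A of [GENER].

```javascript
// Hypothetical helper, not ITPilot API: replaces $0, $1, ... with the element
// at that position and evaluates the resulting JavaScript expression.
function evalPositional(expr, elements) {
    var substituted = expr.replace(/\$(\d+)/g, function (match, index) {
        var value = elements[parseInt(index, 10)];
        if (typeof value !== "number") {
            throw new Error("Only numeric elements are supported in this sketch: " + match);
        }
        return String(value);
    });
    // The placeholders are now numbers; evaluate what remains as plain JavaScript.
    return new Function("return (" + substituted + ");")();
}

var met = evalPositional("($0 <= $1)", [3, 5]);     // first element 3, second element 5
var notMet = evalPositional("($0 <= $1)", [7, 5]);  // first element 7, second element 5
```

With these inputs met is true and notMet is false, mirroring how exec(elements) is described as applying the constructor's expression to the list it receives.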

5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
    o Constructor(listname): creates an empty list.
        • listname: name of the list of records to be created.
    o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
    o Constructor(): creates a persistent browser and returns its handler.
    o exec(): executes the component.

5.3.8 Diff

• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in the retrieved HTML code.
• Functions:
    o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
        • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
        • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
        • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
        • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
        • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
    o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
        • baseCode: character string with the source page content.
        • finalCode: character string or page object with the target page content.
    o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
        • baseCode: character string with the source page content.
        • finalCode: character string or page object with the target page content.
    o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
        • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
    o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
        • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
        • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

    o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
        • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    o setNullWhenEquals(nullWhenEquals): if the result page is identical to any of the two input pages, the component will return "null" instead of the page itself.
        • nullWhenEquals: "true" means that "null" will be returned when both pages are equal; "false" means that the result page will be returned.
    o setIgnoreTagAttributes(simplifyTags): the component will not take the HTML tag attributes into account when comparing both pages.
        • simplifyTags: "true" means that the HTML tag attributes will be ignored. With "false" they will not be ignored.
    o setCaseInsensitive(toLowerCase): establishes whether capitalization will be taken into account when comparing the pages.
        • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
    o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
        • mergedDeletions: with "true", the deleted content will be shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
    o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied on HTML tokens of the source pages before comparing them.
        • replacement: Perl [PERL] regular expression.
    o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied on HTML tokens of the page. Those that match the regular expression will be discarded before starting the comparison.
        • regexp: Perl [PERL] regular expression.

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>) SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4 Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
    o Constructor(expression)
        • expression: object that defines the expression. This object is expressed as a character string, in the same way as condition expressions (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
        • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
    o Constructor(name, page, specification, structure)
        • name: name of the Extractor component instance.
        • page: page-type ITPilot structure from which data is to be extracted.
        • specification: DEXTL data extraction specification (see [DEXTL]).
        • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
    o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
    o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
        • merge: Boolean parameter; "true" if the pattern merge technique is to be applied, or "false" if not. This is "true" by default.
    o setI18n(i18n): updates the internationalization used by the process.
        • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
    o Constructor(url, sequenceType, reusableConnection, binary, page)
        • url: URL where the resource to be downloaded can be found (OPTIONAL).
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
        • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
        • page: optionally, the page from which the HTTP request is launched can be indicated.
    o exec(page): runs the component, returning the string- or binary-type value obtained.
        • page: optionally, the page from which the HTTP request is launched can be indicated.
    o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
        • encoding: MIME type of the information to send.
    o syncWithPost(flag): sets the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
        • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence, defined with the setBackSequence function, exists; if it does not exist, ITPilot executes a Back() NSEQL command.
    o setBackSequence(back): optionally sets an explicit navigation sequence back to the originating page, against which further information extraction operations are going to be executed.
        • back: back sequence NSEQL program.
    o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

bull reusingConnection if the value is set to ldquotruerdquo the connection coming from previous components is reused if set to lsquofalsersquo a new browser will be launched importing information from the previous session

o setBackPages(pages) this function determines the number of pages ITPilot can go back when a Back() NSEQL command is being executed if neither back sequence has been defined nor has been defined as a POST navigation

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 Denodo HTTP browser implementation
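
The precedence between syncWithPost, setBackSequence and setBackPages described above can be read as a decision order. The following is a minimal sketch in plain JavaScript, not ITPilot code: the function chooseBackStrategy, the shape of its config argument and the default of one back page are assumptions made only to illustrate that order.

```javascript
// Hypothetical model of how the page state is recovered. It follows the
// precedence described for syncWithPost, setBackSequence and setBackPages.
function chooseBackStrategy(config) {
    // syncWithPost(true): re-send the original POST to the page URL.
    if (config.syncWithPost) {
        return { strategy: "POST" };
    }
    // Otherwise use the explicit back sequence, if one was set.
    if (config.backSequence) {
        return { strategy: "BACK_SEQUENCE", sequence: config.backSequence };
    }
    // Fall back to the NSEQL Back() command. The default of 1 page is an
    // assumption of this sketch, not a documented value.
    return { strategy: "BACK_COMMAND", pages: config.backPages || 1 };
}

var usingPost = chooseBackStrategy({ syncWithPost: true });
var usingBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

Here usingPost selects the POST strategy, while usingBack falls through to the Back() command with two pages because no back sequence was supplied.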

5.3.13 Filter

bull Object Filter

bull Description this carries out a filtering operation from a list of records returning those meeting a given condition

bull Functions

o Constructor(expr, auxiliaryRecords)

bull expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.

bull auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.

o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subset complying with the selection expression indicated in the constructor.

bull inputRecords: list of input records.

bull auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Filter will return the list of filtered elements except for the one that caused the error.
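
The behaviour described in the note can be illustrated with a small stand-in. This is a sketch in plain JavaScript under stated assumptions: filterRecords, the string values of the two error constants and the use of a JavaScript predicate instead of an ITPilot expression are all hypothetical and are not part of the ITPilot API.

```javascript
// Hypothetical stand-in for the Filter semantics; not the ITPilot implementation.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function filterRecords(inputRecords, condition, onError) {
    var result = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (condition(inputRecords[i])) {
                result.push(inputRecords[i]);
            }
        } catch (e) {
            // ON_ERROR_IGNORE: skip only the record that caused the error.
            if (onError !== ON_ERROR_IGNORE) {
                throw e;
            }
        }
    }
    return result;
}

// The second record has no PRICE, so evaluating the condition on it fails.
var records = [{ PRICE: "10" }, { PRICE: null }, { PRICE: "30" }];
var hasPrice = function (r) { return r.PRICE.length > 0; };
var filtered = filterRecords(records, hasPrice, ON_ERROR_IGNORE);
```

With ON_ERROR_IGNORE the faulty record is dropped and the other two are returned; with ON_ERROR_RAISE the same call would propagate the error.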

5.3.14 Form Iterator

bull Object Form_Iterator

bull Description this allows a run loop to be generated for a specific form where predetermined values for each of the fields included are used in each run

bull Functions

o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

bull findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

bull submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

bull baseElements optional list of records that can be employed as variables to use in the different NSEQL browsing sequences used in this component

bull inputPage input page from which the selected form can be iteratively invoked

bull parallelIterator: if “true”, the component will execute its iterations in parallel.

o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.

bull field: name of the multiple selection field.

bull position: position of the field among those with the same name, starting with position 0.

bull positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

bull clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. There are certain JavaScript constants defined for this:

bull CLICKED_ELEMENT mark the element

bull NON_CLICKED_ELEMENT leave the element as unmarked

bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked

o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): this indicates the values selected from a multiple selection field for the chosen form.

bull field: name of the multiple selection field.

bull position: position of the field among those with the same name, starting with position 0.

bull valuesArray: list of values that must be selected in the field.

bull positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

bull equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).

bull clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. There are certain JavaScript constants defined for this:

bull CLICKED_ELEMENT mark the element

bull NON_CLICKED_ELEMENT leave the element as unmarked

bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked

o selectPositions(field, position, positions): this indicates the values selected from a selection field for the chosen form.

bull field name of the HTML selection field

bull position position occupied in the event of more than one field element with the same name

bull positions values of the elements on which the component must iterate

o selectTexts(field, position, values, positions, equal): this indicates the values to be used in the different iterations on a text field.

bull field name of the HTML text field

bull position position of the field in the event of several on the form with the same value

bull values list of values that must be selected in the field

bull positions list that indicates the position held for each value element in the event of replicated values

bull equals: boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.

o click(field, value, state): function that allows an element to be selected and a “click” event to be run on it.

bull field: name of the HTML field on which the click is to be made.

bull value: when this function is run on Radio Buttons, this parameter indicates the elements selected as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.

bull state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:

bull CLICKED_ELEMENT mark the element

bull NON_CLICKED_ELEMENT leave the element as unmarked

bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked

o input(field, position, values): function that indicates the values added to an input field.

bull field name of the HTML input field

bull position position of the field in the event of several on the form with the same name

bull values list of values that must be selected in the field

o textarea(field, position, values): this indicates the values added to a text area.

bull field name of the HTML input field

bull position position of the field in the event of several on the form with the same name

bull values list of values that must be selected in the field

o toList() returns the list with the NSEQL sequences used in each iteration

o setMaxIterations(count) sets the maximum number of iterations that can be executed

bull count number that determines the maximum number of iterations

o setRetries(count) update method for the number of retries in the event of failures

bull count number of retries

o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated

bull mseconds this indicates the waiting time between retries in milliseconds

o setParallelIterator(flag) the component launches the iteration in parallel

bull flag: if “true”, the iterations will be executed in parallel.

o next(inputPage) this returns the page resulting from running a component iteration

bull inputPage optional parameter that allows for a new starting page to be indicated on which a new component iteration is run

o hasNext(): function that determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.

o close() function that closes the iterator

o syncWithPost(flag): this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.

bull flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run.

o setBackSequence(back): this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that more data extraction operations can be carried out on it.

bull back NSEQL back program

o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not

bull reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.

o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 Denodo HTTP browser implementation
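
The effect of the CLICKED_AND_NON_CLICKED_ELEMENT constant on the number of iterations can be sketched as follows. This is plain JavaScript written for illustration only: the constant values, the function expandCombinations and the assumption that several such elements combine as a cartesian product are not taken from ITPilot.

```javascript
// Hypothetical constant values; the real ITPilot values are not documented here.
var CLICKED_ELEMENT = "CLICKED";
var NON_CLICKED_ELEMENT = "NON_CLICKED";
var CLICKED_AND_NON_CLICKED_ELEMENT = "BOTH";

// Expands one state per element into the list of marked/unmarked combinations.
// Each combination is an array of booleans (true = element marked).
function expandCombinations(states) {
    var combinations = [[]];
    states.forEach(function (state) {
        var options = state === CLICKED_AND_NON_CLICKED_ELEMENT
            ? [true, false]
            : [state === CLICKED_ELEMENT];
        var next = [];
        combinations.forEach(function (prefix) {
            options.forEach(function (marked) {
                next.push(prefix.concat([marked]));
            });
        });
        combinations = next;
    });
    return combinations;
}

var combos = expandCombinations([
    CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT,
    NON_CLICKED_ELEMENT
]);
```

In this model the single element declared with CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of combinations, so combos holds two entries: one with the second element marked and one with it unmarked.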

5.3.15 Get Page

bull Object Get_Page

bull Description obtains an active browser from the browser pool from a previously retrieved identification code

bull Functions

o Constructor(browserUuid) obtains (or optionally creates) the handler to an active browser from its identification

bull browserUuid browser id

o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27).

bull pageType type of browser used to access the page

bull SEQUENCE_IEBROWSER = 1

bull SEQUENCE_HTTP_BROWSER = 2

bull lastURL last URL where the page is coming from

bull lastURLMethod: access method (GET, POST) of the URL the page is coming from.

bull lastURLPostParameters: POST-method parameters of the URL the page is coming from.

bull cookie: information stored in “cookies”.

bull proxyUser user name to access the Proxy if required

bull proxyPassword user password to access the Proxy if required

bull proxyDomain Proxy domain if required

5.3.16 Init

bull Object Init

bull Description is responsible for storing the structure of the input data which is the data that the wrapper will receive from the calling application

bull Functions

o Constructor(input, output)

bull input: input record of the component. Optionally used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context.

bull output name of the output record of the component which represents the query parameters of the wrapper Its use is optional in the standard process main function if not specified the record will be generated at runtime (with the exec() function)

o get(name) this returns the value of a record field created as a group of initialization parameters

bull name name of the record field

o setText(field, obl, fixedValue): this creates a text-type field in the initialization record.

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setInt(field, obl, fixedValue): this creates an integer-type field in the initialization record.

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setLong(field, obl, fixedValue): this creates a long-type field in the initialization record.

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setFloat(field, obl, fixedValue): this creates a floating-type field in the initialization record.

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setDouble(field, obl, fixedValue): this creates a double-type field in the initialization record.

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setBlob(field, obl, fixedValue): this creates a BLOB-type (binary large object) field in the initialization record.

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setBoolean(field, obl, fixedValue): this creates a Boolean-type field in the initialization record.

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setLink(field, obl, fixedValue): this creates a URL-type field in the initialization record.

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setDate(field, format, obl, fixedValue): this creates a date-type field in the initialization record.

bull field: name of the field to create.

bull format: representation format of the date field. This format is optional, but once it has been specified it becomes compulsory; otherwise the wrapper may not run. This representation format is defined in [DATEFORMAT].

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setName(name) update function for the component name

bull name new component name

o setI18n(i18n) function which updates the process i18n

bull i18n: type of internationalization to be used. ITPilot provides different types of i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

o exec() main function for running the component returning a record representing the wrapper initialization parameters
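
The three values of the obl parameter (OPTIONAL, OBLIGATORY and FIXED) can be modelled with a short stand-in. The sketch below is plain JavaScript and does not use the Init object: resolveParameters, the field definition objects and the rule that a FIXED value cannot be overridden by the query are assumptions made for illustration.

```javascript
// Hypothetical string values for the three obligation constants.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical helper: builds the initialization record from the field
// definitions and the parameters supplied in a wrapper query.
function resolveParameters(fields, query) {
    var record = {};
    fields.forEach(function (f) {
        if (f.obl === FIXED) {
            // Constant value; this sketch assumes the query cannot override it.
            record[f.name] = f.fixedValue;
        } else if (Object.prototype.hasOwnProperty.call(query, f.name)) {
            record[f.name] = query[f.name];
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        }
        // OPTIONAL fields without a value are left out of the record.
    });
    return record;
}

var fields = [
    { name: "QUERY", obl: OBLIGATORY },
    { name: "LANG", obl: FIXED, fixedValue: "en" },
    { name: "MAXPRICE", obl: OPTIONAL }
];
var params = resolveParameters(fields, { QUERY: "denodo" });
```

In this example params contains QUERY and LANG only; calling resolveParameters without QUERY would raise an error because that field is obligatory.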

5.3.17 Iterator

bull Object Iterator

bull Description component that iterates on a list of records one by one

bull Functions

o Constructor(list)

bull list list of records on which to iterate

o hasNext(): this determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.

o next() this returns the next iteration element The list is a sorted sequence of records

The “Parallel Execution” option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}

Figure 5 Using threads in the Iterator component
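
For comparison, the sequential hasNext()/next() protocol can be tried outside ITPilot with a stand-in object. The sketch below is plain JavaScript: ListIterator is a hypothetical class that only mimics the two functions documented for the Iterator component, and the TITLE field is an invented example.

```javascript
// Hypothetical stand-in that mimics the hasNext()/next() protocol of Iterator.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}

// True while at least one more record remains.
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};

// Returns the next record, in list order.
ListIterator.prototype.next = function () {
    return this.list[this.position++];
};

var iterator = new ListIterator([{ TITLE: "a" }, { TITLE: "b" }]);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);
}
```

The loop visits the records one by one in list order, so titles ends up holding "a" and then "b".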

5.3.18 JDBCExtractor

bull Object JDBCExtractor

bull Description These functions allow sending a query to any source available via JDBC and return a record list with the obtained results

bull Functions

o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

bull uuid component unique identifier

bull uri connection URL to the database

bull driver driver class to use to connect to the data source

bull userName user name

bull password user password

bull structure: structure of the component’s output record list. It is defined as a record of values.

bull baseRecords: record list to be used.

bull maxPoolSize: maximum number of connections that can be managed by the pool at the same time.

bull initialPoolSize: initial number of pool connections. A number of idle connections are established, ready to be used.

bull checkQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists

bull query SQL query that returns the results required by the component

o exec(query, baseRecords): executes the JDBCExtractor component.

bull query SQL query that returns the results required by the component

bull baseRecords record list to be used

o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

bull maxPoolSize: maximum number of connections that can be managed by the pool at the same time.

bull initialPoolSize: initial number of pool connections. A number of idle connections are established, ready to be used.

bull pingQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists

o disablePool() disables the connection pool

o addDriverProperty(propname, propvalue): adds a JDBC driver property.

bull propname property name

bull propvalue property value

5.3.19 Loop

bull Description: This allows loops to be made in the flow. The loop will be repeated as long as the given condition is met (WHILE… DO). The loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}

Figure 6 Using the Loop function
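
The control flow of Figure 6 can be run outside ITPilot by replacing the Condition object with a stand-in. In the sketch below, StubCondition is hypothetical: it wraps a plain JavaScript predicate instead of an ITPilot condition expression, and the page counter is an invented example.

```javascript
// Stand-in for a Condition object: exec() evaluates a plain JavaScript predicate.
function StubCondition(predicate) {
    this.predicate = predicate;
}
StubCondition.prototype.exec = function (records) {
    return this.predicate(records);
};

var pageNumber = 0;
var visited = [];
var loop = new StubCondition(function () {
    return pageNumber < 3;
});

// The condition is evaluated before each pass, so the body may run zero times.
while (loop.exec([])) {
    visited.push(pageNumber);
    pageNumber++;
}
```

The body runs for page numbers 0, 1 and 2 and stops as soon as the condition evaluates to false.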

5.3.20 Next Interval Iterator

bull Object Next_Interval_Iterator

bull Description: this allows iterating over different inter-related pages by means of one or several browsing sequences.

bull Functions

o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

bull sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.

bull iterations: this indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter.

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reuse: boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session’s information.

bull inputPage this indicates the page from which the next browsing sequence is to be made

o next(inputRecords, inputPage): this returns the next iteration element.

bull inputRecords list of input records that can be used as parameters within the browsing sequences at the next interval

bull inputPage this indicates the page from which the next pages are to be accessed

o close() this closes the iterator

o setRetries(count) this configures the number of retries in the event of error in accessing the next page

bull count number of retries

o setRetryDelay(count) this configures the interval between two retries

bull count interval in milliseconds

o syncWithPost(flag): this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.

bull flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run.

o setBackSequence(back): this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that more data extraction operations can be carried out on it.

bull back NSEQL back program

o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not

bull reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.

o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 HTTP browser implementation

5.3.21 Output

bull Object Output

bull Description this places a record in the wrapper output

bull Functions

o Constructor(structure)

bull structure parameter that indicates the component input record to be used as the wrapper result

o add(record): this allows the component input record that is to be used as the wrapper result to be added subsequently.

bull record record to use

5.3.22 Record Constructor

bull Object Record_Constructor

bull Description this allows a record to be constructed using other records generated in the flow as well as generating new attributes derived from already existing ones

bull Functions

o Constructor(recordsObj, name)

bull recordsObj list of input elements Each element from the list can be a record or a list of records

bull name name of the output record of the Record Constructor component

o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.

bull fieldName: name of the field.

bull expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

bull errorAction action to be run in the event of it not being possible to assess the expression correctly The possible values are

bull ON_ERROR_RAISE stop wrapper run indicating the source of the error

bull ON_ERROR_IGNORE ignore the error continuing with the wrapper run

o exec() this runs the Record Constructor component instance returning an object that represents the record obtained

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements except for the one that caused the error.
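
The way a field expression points at one of the input records can be illustrated with a small resolver. This is a sketch in plain JavaScript under stated assumptions: it presumes the reference syntax is a dollar sign, the record index, a dot and the field name (as in $0.PARAM1), and resolveReference is a hypothetical helper, not an ITPilot function.

```javascript
// Hypothetical resolver for references of the form "$<index>.<FIELD>".
function resolveReference(expression, recordsObj) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) {
        throw new Error("Unsupported expression: " + expression);
    }
    // match[1] is the position in recordsObj, match[2] the field name.
    var record = recordsObj[Number(match[1])];
    return record[match[2]];
}

var recordsObj = [{ PARAM1: "books" }, { PARAM1: "music" }];
var fromFirst = resolveReference("$0.PARAM1", recordsObj);
var fromSecond = resolveReference("$1.PARAM1", recordsObj);
```

Index 0 selects the first input record, so fromFirst takes the value of PARAM1 from that record and fromSecond takes it from the second one.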

5.3.23 Record Sequence or Extractor Sequence

bull Object Record_Sequence

bull Description This creates a browsing sequence created from the results of a record It allows sequences to be created for access to other pages from pages processed by the Extractor component

bull Functions

o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

bull sequences ordered and sequential list of the NSEQL browsing sequences to be used by the component

bull sequenceDepends ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reuse: Boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.

bull inputPage optional this allows for a homepage to be indicated

o exec() this returns a page object that represents the target page of the browsing sequences

o All of the methods offered by the Sequence component

5.3.24 Release Persistent Browser

bull Object Release_Persistent_Browser

bull Description accepts a browser id or a page as browser identifier and releases that specific browser

bull Functions

o Constructor(page)

bull page page loaded on the browser that is going to be released

o Constructor(browserUuid)

bull browserUuid browser identifier

o exec() executes the component

5.3.25 Repeat

bull Description: This allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));

Figure 7 Using the Repeat function
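
The structure of Figure 7 differs from a while loop in that the body always runs once before the condition is first evaluated. The sketch below shows this in plain JavaScript; StubCondition is a hypothetical stand-in that wraps a JavaScript predicate instead of an ITPilot condition expression.

```javascript
// Stand-in for a Condition object: exec() evaluates a plain JavaScript predicate.
function StubCondition(predicate) {
    this.predicate = predicate;
}
StubCondition.prototype.exec = function (records) {
    return this.predicate(records);
};

// Case 1: the body keeps running while exec() returns true.
var attempts = 0;
var repeat = new StubCondition(function () {
    return attempts < 3;
});
do {
    attempts++;
} while (repeat.exec([]));

// Case 2: exec() is false from the start, yet the body has already run once.
var ranOnce = 0;
var never = new StubCondition(function () {
    return false;
});
do {
    ranOnce++;
} while (never.exec([]));
```

After both loops, attempts is 3 and ranOnce is 1, which shows that the body executes at least once regardless of the condition.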

5.3.26 Script

bull Description The component allows for part of the description logic of an ITPilot wrapper to be written in JavaScript This component has no specific JavaScript function associated When this component is used from the generation graphic interface it becomes a JavaScript function that is invoked from the place held within the process flow

5.3.27 Sequence

bull Object Sequence

bull Description This creates a browsing sequence in NSEQL language (see [NSEQL])

bull Functions

o Constructor(sequence, sequenceType, reusableConnection, inputPage)

bull sequence: NSEQL browsing program (see [NSEQL]).

bull sequenceType: type of pool to use. The possible values are:

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

bull inputPage: optional parameter; this indicates the starting page. If it is not specified, the NSEQL program is run directly.

o exec(inputValues, inputPage): this runs the Sequence component, returning the last page that the browsing sequence has reached.

bull inputValues list of values that can be used as input parameters within the browsing sequence

bull inputPage optional parameter this describes the page from which the component browsing sequence is run

o setRetries(count) update function for the number of retries in the event of failures

bull count number of retries

o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated

bull mseconds this indicates the waiting time between retries in milliseconds

o close() this closes the connection with the running browser

o syncWithPost(flag) this method indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function

bull flag ldquotruerdquo indicates that this synchronization function must be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() command is run


    o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
        • back: NSEQL back program.

    o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
        • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.

    o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
        • pages: number of back pages.

    o toString(): returns the NSEQL sequence (see [NSEQL]).

    o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
        • 0: default browser implementation
        • 1: Internet Explorer browser implementation
        • 2: Firefox browser implementation
        • 3: Denodo HTTP browser implementation
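The Sequence calls documented above might be combined as in the following sketch. So that it can run outside ITPilot, it first defines a minimal stand-in for Sequence that only records its configuration. The stand-in, the value given to SEQUENCE_IEBROWSER, the NSEQL text, the boolean passed as reusableConnection and the runSearch helper are all assumptions made for illustration; inside a real wrapper the stand-in would be dropped and the object supplied by ITPilot used instead.

```javascript
// --- Stand-in for illustration only: NOT the real ITPilot Sequence object ---
var SEQUENCE_IEBROWSER = "SEQUENCE_IEBROWSER"; // placeholder value (assumption)

function Sequence(sequence, sequenceType, reusableConnection, inputPage) {
    this.sequence = sequence;
    this.sequenceType = sequenceType;
    this.reusableConnection = reusableConnection;
    this.inputPage = inputPage;
    this.retries = 0;
    this.retryDelay = 0;
}
Sequence.prototype.setRetries = function (count) { this.retries = count; };
Sequence.prototype.setRetryDelay = function (mseconds) { this.retryDelay = mseconds; };
Sequence.prototype.exec = function (inputValues, inputPage) {
    // The real component browses and returns the last page reached;
    // the stand-in only echoes what it was asked to run.
    return { program: this.sequence, inputs: inputValues };
};

// --- Usage pattern, following the signatures documented above ---
function runSearch(director) {
    // Hypothetical NSEQL program; see [NSEQL] for the real syntax.
    var nseql = "Navigate(http://www.example.com/search, 0);";
    // inputPage is omitted, so the NSEQL program is run directly.
    var seq = new Sequence(nseql, SEQUENCE_IEBROWSER, true);
    seq.setRetries(3);        // number of retries in the event of failures
    seq.setRetryDelay(2000);  // waiting time between retries, in milliseconds
    return seq.exec([director]);
}

var page = runSearch("Woody Allen");
```

The retry settings are applied before exec, since they describe how the component should behave while the sequence is running.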


5.3.28 Store File

• Object: StoreFile

• Description: Stores the contents passed as the input parameter in a file.

• Functions:

    o Constructor(content, file)
        • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case the page content is stored.
        • file: path and name of the file where the contents are to be stored.

    o exec(): runs the component.

    o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
        • generate: indicates whether the file name should be generated automatically.

    o setRetries(count): sets the number of retries in the event of failures.
        • count: number of retries.

    o setRetryDelay(mseconds): sets the waiting time between retries.
        • mseconds: waiting time between retries, in milliseconds.


5.3.29 Thread

• Object: Thread

• Description: Represents a thread in the ITPilot wrapper. It is typically used when the records obtained in an extraction operation are to be processed concurrently.

• Functions:

    o wait(): blocks until all executions launched with the execute function have finished.

    o execute(functionName, <list of arguments>): launches a thread that runs the given function.
        • functionName: name of the JavaScript function to be run.
        • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.

    o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will run in parallel. Later requests are queued until the ongoing executions finish.
        • int: maximum number.
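A typical fan-out over extracted records might look like the sketch below. The guide does not show how a Thread is instantiated or whether functionName is passed as a string, so `new Thread()` and the string name are assumptions, as are the processRecord function and the sample records. The stand-in defined first is not the real ITPilot object: it runs each call immediately on the current thread, which is enough to show the call sequence.

```javascript
// --- Stand-in for illustration only: NOT the real ITPilot Thread object ---
var registry = {}; // maps function names to functions (detail of the stand-in)

function Thread() {
    this.maxConcurrent = 1;
}
Thread.prototype.setMaxConcurrentThreads = function (max) { this.maxConcurrent = max; };
Thread.prototype.execute = function (functionName) {
    // Forward every argument after the function name to the target function.
    var args = Array.prototype.slice.call(arguments, 1);
    registry[functionName].apply(null, args);
};
Thread.prototype.wait = function () { /* nothing pending in the stand-in */ };

// --- Usage pattern ---
var details = [];

// Hypothetical per-record processing function.
function processRecord(title, year) {
    details.push(title + " (" + year + ")");
}
registry.processRecord = processRecord;

var records = [["Manhattan", 1979], ["Zelig", 1983]];
var thread = new Thread();
thread.setMaxConcurrentThreads(2); // later requests queue until a slot is free
for (var i = 0; i < records.length; i++) {
    // The arguments after the name must match processRecord's parameters.
    thread.execute("processRecord", records[i][0], records[i][1]);
}
thread.wait(); // continue only once every launched execution has finished
```

Calling wait() after the loop, rather than inside it, is what allows the executions to overlap in a real multi-threaded run.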


5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix in the path <DENODO_HOME>/metadata/itp-custom-components containing the following functions:

• mycustom_main(mycustom_input) {
      var mycustom_output = null;
      …
      return mycustom_output;
  }

    o This is the main function, where "mycustom" is the name of the custom component.

• mycustom_getInputStructure() { … }

    o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

    o This function defines the component output type. The possible values are:
        • LIST_TYPE = 1
        • PAGE_TYPE = 2
        • RECORD_TYPE = 3
        • SIMPLE_TYPE = 4
        • ARRAY_TYPE = 5
        • BINARY_TYPE = 6
        • BOOLEAN_TYPE = 7
        • DATE_TYPE = 8
        • DOUBLE_TYPE = 9
        • FLOAT_TYPE = 10
        • INT_TYPE = 11
        • LONG_TYPE = 12
        • STRING_TYPE = 13
        • URL_TYPE = 14
        • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }


    o This function defines the output structure returned by the component. It is needed only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
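Put together, a custom component file might look like the following sketch for a hypothetical component named "greeter" (stored as greeter.js). The component name, its behavior, the shape of the input passed to greeter_main (a plain string here) and the minimal Record_Structure stand-in are assumptions made so the sketch can run on its own; only the function naming scheme and the type constant values come from the list above.

```javascript
// Output type constants, with the values listed above.
var RECORD_TYPE = 3;
var STRING_TYPE = 13;

// Stand-in for illustration only: NOT the real Record_Structure object.
function Record_Structure(name) { this.name = name; this.fields = []; }
Record_Structure.prototype.setText = function (field) { this.fields.push(field); };

// Main function: <component>_main, here for a component called "greeter".
function greeter_main(greeter_input) {
    var greeter_output = null;
    greeter_output = "Hello, " + greeter_input; // hypothetical component logic
    return greeter_output;
}

// Input schema: a single text field (assumed shape).
function greeter_getInputStructure() {
    var input = new Record_Structure("GREETER_INPUT");
    input.setText("NAME");
    return input;
}

// Output type: a simple string.
function greeter_getOutputType() {
    return STRING_TYPE;
}

// greeter_getOutputStructure() is omitted on purpose: it is needed only
// when the output type is RECORD_TYPE or LIST_TYPE.
```

Keeping every function prefixed with the component name matches the convention that the file name and the component name coincide.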

5.4.2 Using Custom Components

To use a custom component developed in JavaScript, store it in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section.

To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}

Figure 8: Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been written.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.


REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl



package com.denodo.itpilot.client;

import java.util.List;
import java.util.HashMap;
import java.util.Map;
import java.util.Iterator;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ValueVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.SimpleVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.ArrayVO;
import com.denodo.vdb.vdbinterface.common.clientResult.vo.sentences.RegisterVO;
import com.denodo.vdb.vdbinterface.client.printer.standard.StandardRowVO;

public class ITPilotExample {

    public static void main(String args[]) {
        try {
            // Connect to server
            HTMLWrapperServerProxy server = new HTMLWrapperServerProxy("acme", 9999);

            // Get Wrapper
            HTMLWrapperProxy wrapper = server.getHTMLWrapper("Movies");

            // Prepare query params
            Map queryParams = new HashMap();
            queryParams.put("DIRECTOR", "Woody Allen");

            // Execute query
            HTMLWrapperResultIterator results = wrapper.query(queryParams);

            // Iterate results
            int numOfTuples = 0;
            while (results.hasNext()) {
                numOfTuples++;
                StandardRowVO tuple = (StandardRowVO) results.next();

                // Process each tuple
                System.out.print(numOfTuples + " ");

                // Get and print atomic fields: TITLE, DIRECTOR
                SimpleVO titleVO = (SimpleVO) tuple.getValue("TITLE");
                String title = (String) titleVO.getValue();
                System.out.println("TITLE: " + title);

                SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
                String director = (String) directorVO.getValue();
                System.out.println("DIRECTOR: " + director);

                // Get EDITIONS array
                ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

                // Iterate over EDITION registers
                int numEditions = 0;
                Iterator editions = editionsVO.getValues().iterator();
                while (editions.hasNext()) {
                    numEditions++;
                    System.out.println("EDITION " + numEditions);
                    RegisterVO editionVO = (RegisterVO) editions.next();
                    Map edition = editionVO.getValues();

                    // Read each field of the register through getValue, as for PRICE and DESCRIPTION
                    SimpleVO formatVO = (SimpleVO) editionVO.getValue("FORMAT");
                    String format = (String) formatVO.getValue();
                    System.out.println("\t FORMAT: " + format);

                    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
                    Double price = priceVO.getDouble();
                    System.out.println("\t PRICE: " + price);

                    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
                    String description = (String) descriptionVO.getValue();
                    System.out.println("\tDESCRIPTION: " + description);
                }
                System.out.println();
            }

            // Check errors
            if (results.checkErrors()) {
                System.out.println("Error: " + results.getErrorDescription());
            }
        } catch (Exception e) {
            System.err.println("Error trying to access server");
        } finally {
        }
    }
}

Figure 1: Example of query execution to a wrapper


4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created without dependencies on Denodo libraries by following naming conventions, but developing them with Java annotations is recommended. The annotations, and the compound types and values required to create custom functions, are located in:

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a Jar contains one or more functions with name conflicts, nothing in that Jar will be loaded in the server.
• All custom functions stored in the same Jar are added or removed together, by uploading or removing the Jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class that defines the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class that contains its whole implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature, the class must define a Java method that implements the function for those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to Virtual DataPort data objects. The following table shows how the mapping works and which Java types can be used:

Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be basic types (int, long, double, etc.).


4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.

To make a Java class recognizable as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that in some cases calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the Jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:
    • name: name of the custom function.
    • type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to mark the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional variable, syntax, that specifies how the function signature is presented to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to mark the method as the one that computes the return type of a function signature before a query is executed.

• com.denodo.common.custom.annotations.CustomParam: Parameter annotation with the parameter name, used to make the automatically generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
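The naming convention described at the start of this section is mechanical enough to sketch as code. The JavaScript below is only an illustration of the rule, not part of ITPilot: functionNameFromClass and signatureFor are hypothetical helpers, the type names come from the equivalency table in section 4, and the "argN:type" rendering with a colon and the upper-casing of the function name are assumptions based on the CONCAT_SAMPLE example.

```javascript
var SUFFIX = "ItpFunction";

// Returns the function name for a class that follows the convention,
// or null if the class name does not end with the suffix.
function functionNameFromClass(className) {
    var cut = className.length - SUFFIX.length;
    if (cut <= 0 || className.lastIndexOf(SUFFIX) !== cut) {
        return null;
    }
    return className.substring(0, cut);
}

// Java to ITPilot simple types, as listed in the equivalency table.
var ITPILOT_TYPES = {
    "java.lang.Integer": "int",
    "java.lang.Long": "long",
    "java.lang.Float": "float",
    "java.lang.Double": "double",
    "java.lang.Boolean": "boolean",
    "java.lang.String": "text",
    "java.util.Calendar": "date",
    "byte[]": "binary"
};

// Builds the signature text for one execute method, naming the
// parameters arg1, arg2, ... as happens when no annotations are used.
function signatureFor(className, javaParamTypes) {
    var name = functionNameFromClass(className);
    if (name === null) {
        return null;
    }
    var args = [];
    for (var i = 0; i < javaParamTypes.length; i++) {
        args.push("arg" + (i + 1) + ":" + ITPILOT_TYPES[javaParamTypes[i]]);
    }
    return name.toUpperCase() + "(" + args.join(", ") + ")";
}

var signature = signatureFor("Concat_SampleItpFunction",
                             ["java.lang.String", "java.lang.String"]);
```

The generic arg1, arg2 names in the result are what the CustomParam annotation replaces with meaningful parameter names.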

4.2 COMPOUND TYPES

Compound types and values in custom functions are defined by the following Java classes:


• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either a java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on the input values, and functions returning compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when executing the function is slower or more memory intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.

The value returned by the additional method depends on what the execute method returns:

• If the execute method returns a basic Java type, the additional method has to return the same basic Java class. I.e. if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 for the type that these return values will have in ITPilot.


4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

import com.denodo.common.custom.annotations.*;
import com.denodo.common.custom.elements.*;
import java.util.*;

@CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
public class Split {

    private static final String STRING_FIELD = "string";

    // Function signature: splits value around matches of regex
    @CustomExecutor()
    public CustomArrayValue split_sample(@CustomParam(name="regexp") String regex,
                                         @CustomParam(name="value") String value) {
        if (value == null || regex == null) {
            return null;
        }
        String[] result = value.split(regex);
        LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
        List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
        // Wrap each substring in a one-field record and collect the records
        for (String string : result) {
            results.put(STRING_FIELD, string);
            CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
            arrayValues.add(recordValue);
        }
        return CustomElementsUtil.createCustomArrayValue(arrayValues);
    }

    // Return type: an array of records with a single text field
    @CustomExecutorReturnType
    public CustomArrayType split_sampleReturnType(String regex, String value) {
        LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
        props.put(STRING_FIELD, String.class);
        CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
        CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
        return array;
    }
}

Figure 2: ITPilot Custom Function Sample


5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot lets users write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262].

The following sections assume some basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

function getInit() {
    var start = new Init();
    start.setText("INITPARAM", "", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}

Figure 3: ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist¹.

These functions mirror the component-based definition of the process: the input parameters are those defined in the Initialization component, and the output record is defined exactly as it is received by the Output component.

¹ Since version 4.0 SP1 this function, previously known as getMetadata, has been renamed to getOutputSchema. There is backwards compatibility, but the use of the new name is strongly recommended.


5.2.1 Initialization of Searchable Parameters

This function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on, in section 5.3 (the component catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure

This function determines the output structure of the wrapper, if one exists. The structure is a data record implemented by the Record_Structure object, which is described in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL)

rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY)
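The reason a middle parameter cannot be skipped is that arguments are bound by position. The sketch below shows this with a minimal stand-in for Record_Structure; the stand-in and the values given to the OPTIONAL and OBLIGATORY constants are assumptions for illustration, and the real object may react differently to a misplaced argument (for instance by raising an error).

```javascript
// Placeholder constant values (assumption; ITPilot defines the real ones).
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Stand-in for illustration only: NOT the real Record_Structure object.
function Record_Structure(name) { this.name = name; this.fields = []; }
Record_Structure.prototype.setText = function (field, regexp, type) {
    this.fields.push({
        field: field,
        regexp: regexp === undefined ? "" : regexp,   // default: no constraint
        type: type === undefined ? OPTIONAL : type    // default: optional field
    });
};

var rs = new Record_Structure("OUT_REC");
rs.setText("TITLE");                     // trailing parameters omitted
rs.setText("YEAR", "", OPTIONAL);        // the same settings, spelled out
rs.setText("DIRECTOR", "", OBLIGATORY);  // regexp must be given to reach type
rs.setText("BROKEN", OBLIGATORY);        // wrong: OBLIGATORY binds to regexp
```

In the last call the field silently stays optional and gets OBLIGATORY as its regular expression, which is why the empty regexp has to be written out whenever the type is set.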

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.

5.3.2.1 Record Structure

• Object: Record_Structure

• Description: A data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).

• Functions:

    o Constructor(name)
        • name: name of the structure.

    o setText(field, regexp, type): creates a new character string field in the record.


        • field: name of the new field.
        • regexp: (optional) regular expression of the character string generation. By default, if no constraint exists, its value is "".
        • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

    o setLink(field, type): creates a new Link-type field in the record.
        • field: name of the new field.
        • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

    o setInt(field, type): creates a new Integer-type field in the record.
        • field: name of the new field.
        • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

    o setBoolean(field, type): creates a new boolean-type field in the record.
        • field: name of the new field.
        • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

    o setLong(field, type): creates a new Long-type field in the record.
        • field: name of the new field.
        • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

    o setFloat(field, type): creates a new Float-type field in the record.
        • field: name of the new field.
        • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

    o setDouble(field, type): creates a new Double-type field in the record.
        • field: name of the new field.
        • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

    o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
        • field: name of the new field.
        • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.


o setDate(field regexp format type) creation of a new Date-type field in the record

bull field name of the new field

bull regexp (optional) regular expression of the character string generation By default if no constraint exists its value is ldquordquo

bull format (optional) date format following [DATEFORMAT] By default its value is d-MMM-yyyy Hh mm ss

bull type (optional) defines whether the parameter is mandatory or not By default the field is optional

o setRegister(record type) creation of a new Record-type field in the record

bull record record name

bull type (optional) defines whether the parameter is mandatory or not By default the field is optional

o setArray(name structure type) creation of a new Array-type field in the record

bull name name of the array

bull structure data structure that represents the record structure contained in the array

bull type (optional) defines whether the parameter is mandatory or not By default the field is optional

o toString() This transforms the record into a string of characters for their representation

When a custom component is created (see section 54) from an ITPilot wrapper program a Record Structure is defined to represent the input values to the custom component

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR (explained in section 5.3.22) must be used, except in the cases of Text-, Integer-, Float- and Link-type fields, for which specific functions apply.

5322 Record List

bull Object List

bull Functions

o setListName(listName) name of the list

bull listName name of the list

o add(obj) addition of an element to the list

bull obj element to add

o toArray() transforms the list into a JavaScript object array

533 Common functions

Some of these functions are common to all or almost all components and are therefore described in this first section. The catalog notes which components do not provide some of the "common" functions.

5331 onError function

bull onError(errorId errorAction) This informs the component of its behavior in the event of any type of error The onError function can be invoked several times with different errorId parameter values

o errorId This indicates the type of error for which the behavior is to be managed The possible values are

bull RUNTIME_ERROR error while the component is being run

bull CONNECTION_ERROR error that occurs when there is some kind of connection problem with the Web source

bull HTTP_ERROR error produced by an http error

bull TIMEOUT_ERROR error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the http browser

bull SEQUENCE_ERROR error produced when there is a problem with the sequence (the sequence is not correctly written or some command could not be run etc)

o errorAction action to be taken when the error indicated in the previous parameter arises The possible values are

bull ON_ERROR_RAISE stop wrapper run indicating the source of the error

bull ON_ERROR_IGNORE ignore the error and continue with the wrapper run. In general, components that have a return value will return "null" if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it is evaluated as "false" if they are used in a condition expression

bull ON_ERROR_RETRY rerun the wrapper The number of retries and time between retries are configured in each parameter

bull ON_ERROR_RETRY_IGNORE rerun the wrapper, as with the ON_ERROR_RETRY action, but continue with the wrapper execution if the error still occurs after the retries
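
The four error actions above can be modelled in plain JavaScript as follows. This is only an illustrative sketch and not ITPilot's implementation: runWithPolicy and flakyStep are hypothetical names, the constants are stand-in strings, and the sketch retries a single step whereas the text says the wrapper is rerun. That ON_ERROR_RETRY raises the error once the retries are exhausted is inferred from the description of ON_ERROR_RETRY_IGNORE. The delay between retries is omitted.

```javascript
// Stand-in constants named after the error actions in the text.
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
const ON_ERROR_RETRY = "ON_ERROR_RETRY";
const ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Hypothetical helper: runs `step` and applies `action` when it throws.
// `retries` is the number of additional attempts after the first one.
function runWithPolicy(step, action, retries = 0) {
  const retrying =
    action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE;
  const attempts = retrying ? retries + 1 : 1;
  let lastError = null;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return step(attempt);
    } catch (error) {
      lastError = error;
    }
  }
  if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
    return null; // continue the run with a null result
  }
  throw lastError; // ON_ERROR_RAISE, or ON_ERROR_RETRY with retries exhausted
}

// Hypothetical step that fails on its first two attempts.
function flakyStep(attempt) {
  if (attempt < 2) throw new Error("TIMEOUT_ERROR");
  return "page";
}

const retried = runWithPolicy(flakyStep, ON_ERROR_RETRY, 3);
const ignored = runWithPolicy(() => {
  throw new Error("HTTP_ERROR");
}, ON_ERROR_IGNORE);
```

Here retried holds the value returned by the third attempt, and ignored is null because the error was swallowed.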

5332 debugLevel function

bull debugLevel(level) sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace file and 5 means that all message types are written to it. The log types are the following

o TRACE

o DEBUG

o INFO

o WARN

o ERROR

o FATAL

534 Add Record To List

bull Object Add_Object_To_List

bull Description adds a record to a list

bull Functions

o Constructor()

o exec(record list) executes the function

bull record record to be added to the list

bull list list to which the record is added

535 Condition

bull Object Condition

bull Description allows a condition to be defined Two output connections determine the process flow depending on whether the condition is met or not

bull Functions

o Constructor(expr)

bull expr this parameter defines the condition expression. It is expressed as a string of characters (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]

o exec(elements) main function of the Condition component. This evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements

bull elements this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements on which the condition is evaluated
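
To make the positional placeholders concrete, the following toy model binds $0 and $1 to the first and second entries of the elements list. It is a sketch under stated assumptions, not the ITPilot Condition component: makeCondition is a hypothetical helper that understands only a single comparison between two placeholders, whereas the real expression language is defined in Appendix A of [GENER].

```javascript
// Hypothetical stand-in for a condition of the form "($i <op> $j)".
function makeCondition(expr) {
  const match = /^\(?\s*\$(\d+)\s*(<=|>=|<|>)\s*\$(\d+)\s*\)?$/.exec(expr);
  if (!match) {
    throw new Error("unsupported expression in this sketch: " + expr);
  }
  const left = Number(match[1]);
  const operator = match[2];
  const right = Number(match[3]);
  const operators = {
    "<=": (a, b) => a <= b,
    ">=": (a, b) => a >= b,
    "<": (a, b) => a < b,
    ">": (a, b) => a > b,
  };
  return {
    // $n refers to elements[n], as described for the exec function.
    exec: (elements) => operators[operator](elements[left], elements[right]),
  };
}

const myCondition = makeCondition("($0 <= $1)");
const met = myCondition.exec([3, 5]);
const notMet = myCondition.exec([7, 5]);
```

With these inputs met is true and notMet is false, because 3 is less than or equal to 5 and 7 is not.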

536 Create List

bull Object Create_List

bull Description creates an empty list

bull Functions

o Constructor(listname) creates an empty list

bull listname name of the list of records to be created

o exec() runs the component

537 Create Persistent Browser

bull Object Create_Persistent_Browser

bull Description creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it

bull Functions

o Constructor() creates a persistent browser and returns its handler

o exec() executes the component

538 Diff

bull Object Diff

bull Description the Diff component allows comparing two pages returning the differences between them regarding the retrieved HTML code

bull Functions

o Constructor(additionPrefixLabel additionSuffixLabel deletionPrefixLabel deletionSuffixLabel tokenSeparator)

bull additionPrefixLabel prefix to use when generating the result page for the new content (by default green background HTML tag)

bull additionSuffixLabel suffix to use when generating the result page for the new content (by default green background HTML end tag)

bull deletionPrefixLabel prefix to use when generating the result page for the deleted content (by default red background HTML tag)

bull deletionSuffixLabel suffix to use when generating the result page for the deleted content (by default red background HTML end tag)

bull tokenSeparator indicates the character string used as HTML page element separator when the result page is generated so that each one of them can be adequately identified

o diff(baseCode, finalCode) returns "true" if both pages are identical, "false" if they are different

bull baseCode character string with the source page content

bull finalCode character string or page object with the target page content

o exec (baseCode finalCode) executes the Diff component returning a character string that represents the HTML content of those pages pointing out the differences between them

bull baseCode character string with the source page content

bull finalCode character string or page object with the target page content

o setAdditionPrefixLabel(additionPrefixLabel) modifies the starting tag for added content

bull additionPrefixLabel prefix to use when generating the result page for new content (by default green background HTML tag)

o setAdditionSuffixLabel(additionSuffixLabel) modifies the ending tag for added content

bull additionSuffixLabel suffix to use when generating the result page for the new content (by default green background HTML end tag)

o setDeletionPrefixLabel(deletionPrefixLabel) modifies the deleted data starting tag

bull deletionPrefixLabel prefix to use when generating the result page for the deleted content (by default red background HTML tag)

o setDeletionSuffixLabel(deletionSuffixLabel) modifies the deleted data ending tag

bull deletionSuffixLabel suffix to use when generating the result page for the deleted content (by default red background HTML end tag)

o setNullWhenEquals(nullWhenEquals) if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself

bull nullWhenEquals "true" implies that "null" is returned when both pages are equal; "false" means that the result page is returned

o setIgnoreTagAttributes(simplifyTags) the component will not take into account the HTML tag attributes when comparing both pages

bull simplifyTags "true" means that the HTML tag attributes are ignored; with "false" they are not ignored

o setCaseInsensitive(toLowerCase) establishes whether capitalization is taken into account when comparing the pages

bull toLowerCase "true" transforms all HTML content to lower case; "false" keeps the content as is

o setShowRemovedContent(mergedDeletions) determines whether the deleted content is shown in the result page

bull mergedDeletions with "true", the deleted content is shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account

o addTokenReplacement(replacement) allows the addition of a regular expression to a list These regular expressions can be applied on HTML tokens of the source pages before comparing them

bull replacement Perl [PERL] regular expression

o addIgnoredToken(regexp) allows the addition of a regular expression to the list These regular expressions can be applied on HTML tokens of the page Those that match the regular expression will be discarded before starting the comparison

bull regexp Perl [PERL] regular expression
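
As a rough picture of how the prefix and suffix labels and the token separator shape a result page, the sketch below marks added and deleted tokens using a longest-common-subsequence comparison. The algorithm ITPilot actually uses is not stated in this guide, so this is a swapped-in, commonly used technique. diffTokens is a hypothetical function, and the bracket labels are stand-ins for the default green and red background HTML tags.

```javascript
// Hypothetical token-level diff. `labels` reuses the constructor parameter names.
function diffTokens(baseCode, finalCode, labels) {
  const a = baseCode.split(labels.tokenSeparator);
  const b = finalCode.split(labels.tokenSeparator);
  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  const lcs = Array.from({ length: a.length + 1 }, () =>
    new Array(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  const added = (t) => labels.additionPrefixLabel + t + labels.additionSuffixLabel;
  const deleted = (t) => labels.deletionPrefixLabel + t + labels.deletionSuffixLabel;
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push(a[i]); // unchanged token
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      out.push(deleted(a[i++])); // token only in the base page
    } else {
      out.push(added(b[j++])); // token only in the final page
    }
  }
  while (i < a.length) out.push(deleted(a[i++]));
  while (j < b.length) out.push(added(b[j++]));
  return out.join(labels.tokenSeparator);
}

const labels = {
  additionPrefixLabel: "[+",
  additionSuffixLabel: "+]",
  deletionPrefixLabel: "[-",
  deletionSuffixLabel: "-]",
  tokenSeparator: " ",
};
const marked = diffTokens("a b c", "a x c", labels);
```

For these inputs marked is "a [-b-] [+x+] c": the token b was removed and the token x was added.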

539 ExecuteJS

bull Description ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence This component is transformed into a Sequence command (see section 5327) that executes the ExecuteJS NSEQL command (see [NSEQL])

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4 Using the ExecuteJS NSEQL command

5310 Expression

bull Object Expression

bull Description allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value

bull Functions

o Constructor(expression)

bull expression object that defines the condition expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]

o exec(exprInput) method running the component and returning the value resulting from the expression indicated in the component constructor

bull exprInput list of zero or more values zero or more records or zero or more record lists that are used as part of the expression

5311 Extractor

bull Object Extractor

bull Description this is responsible for extracting structured data from an HTML page thus generating a DEXTL program ([DEXTL])

bull Functions

o Constructor(name page specification structure)

bull name name of the Extractor component instance

bull page page-type ITPilot structure from where data is to be extracted

bull specification DEXTL data extraction specification (see [DEXTL])

bull structure name of the record (previously created) that will be used to return the data extracted by the specification

o exec() main extractor method running the specification indicated in the constructor This function returns a list of records of the type defined in the constructor in the structure parameter

o setMergePatterns(merge) This applies the technique of merging patterns for greater system optimization (see [GENER] for further information)

bull merge Boolean parameter: "true" if the pattern merge technique is to be applied, "false" if not. This is "true" by default

o setI18n(i18n) Function that updates the process internationalization

bull i18n type of internationalization to use ITPilot provides different types of internationalization options such as ES_EURO US_PST GB and so on See [GENER] for more information about internationalization in ITPilot

5312 Fetch

bull Object Fetch

bull Description this obtains the contents of the URL or page used as the input argument and returns them in binary or text format

bull Functions

o Constructor(url sequenceType reusableConnection binary page)

bull url URL where the resource to be downloaded can be found (OPTIONAL)

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reusableConnection indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information

bull binary "true": the object to be downloaded is binary; "false": the object to be downloaded is in text format

bull page Optionally the page from which the http request is launched can be indicated

o exec(page) This runs the component returning the string- or binary-type value obtained

bull page Optionally the page from which the http request is launched can be indicated

o setEncoding(encoding) allows the user to determine the MIME type [MIME] of the information to send

bull encoding MIME type of the information to send

o syncWithPost(flag) this function lets the user set the method for recovering the page state ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page This is the default synchronization method

bull flag "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if it has not, ITPilot executes a Back() NSEQL command

o setBackSequence(back) this function lets the user optionally set an explicit navigation sequence that returns to the source page, against which more information extraction operations are going to be executed

bull back back sequence NSEQL program

o setReusingConnection(reusingConnection) this function indicates whether connections will be reused or not

bull reusingConnection if the value is set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session

o setBackPages(pages) this function determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 Denodo HTTP browser implementation
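
The order in which syncWithPost, setBackSequence and setBackPages are consulted can be summarised as a small decision function. This is a sketch of the logic described above rather than ITPilot code: chooseReturnStrategy and the shape of its config argument are hypothetical, and the default of one page for the Back() command is an assumption.

```javascript
// Hypothetical summary of how the page state is recovered.
function chooseReturnStrategy(config) {
  if (config.syncWithPost) {
    // Default method: re-send the POST that originally reached the page.
    return { kind: "POST" };
  }
  if (config.backSequence) {
    // An explicit NSEQL back sequence was set with setBackSequence.
    return { kind: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // Otherwise fall back to the NSEQL Back() command (assumed default: 1 page).
  const pages = config.backPages === undefined ? 1 : config.backPages;
  return { kind: "BACK_COMMAND", pages: pages };
}

const byPost = chooseReturnStrategy({ syncWithPost: true });
const byBackCommand = chooseReturnStrategy({ syncWithPost: false, backPages: 2 });
```

In this sketch byPost selects the POST method, while byBackCommand falls through to the Back() command with two pages because no back sequence was supplied.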

5313 Filter

bull Object Filter

bull Description this carries out a filtering operation from a list of records returning those meeting a given condition

bull Functions

o Constructor(expr auxiliaryRecords)

bull expr regular expression of the filtering operation for a list of records which are described in the exec function

bull auxiliaryRecords record list that participates in the filter condition but which are not the records to filter

o exec(inputRecords auxiliaryRecords) function receiving a list of records and returning the subgroup complying with the selection expression indicated in the constructor

bull inputRecords list of input records

bull auxiliaryRecords record list that participates in the filter condition but which are not the records to filter

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error
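
The behaviour in the note can be pictured with a plain JavaScript loop in which a record whose evaluation fails is skipped while the remaining records are still filtered. This is an illustrative model only: filterIgnoringErrors is a hypothetical function, and a JavaScript predicate stands in for the filter expression.

```javascript
// Hypothetical model of FILTER with ON_ERROR_IGNORE.
function filterIgnoringErrors(inputRecords, predicate) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record)) kept.push(record);
    } catch (error) {
      // Skip only the record that caused the error and keep going.
    }
  }
  return kept;
}

const records = [
  { title: "A", price: { amount: 5 } },
  { title: "B", price: null }, // evaluating the predicate throws here
  { title: "C", price: { amount: 50 } },
  { title: "D", price: { amount: 8 } },
];
const cheap = filterIgnoringErrors(records, (r) => r.price.amount < 10);
const cheapTitles = cheap.map((r) => r.title);
```

Record B is dropped because reading its price fails, record C is dropped because it does not meet the condition, and cheapTitles contains A and D.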

5314 Form Iterator

bull Object Form_Iterator

bull Description this allows a run loop to be generated for a specific form where predetermined values for each of the fields included are used in each run

bull Functions

o Constructor(findForm submitForm sequenceType reusableConnection baseElements inputPage parallelIterator)

bull findForm NSEQL program that finds the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL)

bull submitForm NSEQL program that submits the form (see [NSEQL] for further information on NSEQL)

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reusableConnection this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information

bull baseElements optional list of records that can be employed as variables to use in the different NSEQL browsing sequences used in this component

bull inputPage input page from which the selected form can be iteratively invoked

bull parallelIterator "true": the component will execute its iterations in parallel

o selectMultiplePositions(field position positionsArray clickedArray) indicates what positions are selected in a multiple selection field in the target form

bull field name of the multiple selection field

bull position position related to the field between those of the same name starting with position 0

bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values

bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this

bull CLICKED_ELEMENT mark the element

bull NON_CLICKED_ELEMENT leave the element as unmarked

bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked

o selectMultipleTexts(field position valuesArray positionsArray equalsArray clickedArray) this indicates the values selected from a multiple selection field for the chosen form

bull field name of the multiple selection field

bull position position related to the field between those of the same name starting with position 0

bull valuesArray list of values that must be selected in the field

bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values

bull equalsArray list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false)

bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this

bull CLICKED_ELEMENT mark the element

bull NON_CLICKED_ELEMENT leave the element as unmarked

bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked

o selectPositions(field position positions) this indicates the values selected from a selection field for the chosen form

bull field name of the HTML selection field

bull position position occupied in the event of more than one field element with the same name

bull positions values of the elements on which the component must iterate

o selectTexts(field position values positions equal) this indicates the values to be used in the different iterations on a text field

bull field name of the HTML text field

bull position position of the field in the event of several on the form with the same name

bull values list of values that must be selected in the field

bull positions list that indicates the position held for each value element in the event of replicated values

bull equals boolean value that indicates whether the values provided must exactly match the field values (true) or need only be contained in them (false)

o click(field, value, state) function that allows an element to be selected and a "click" event run on it

bull field name of the HTML field on which the click is to be made

bull value when this function is run on Radio Buttons this parameter indicates the elements selected as a list (eg [0 1]) When run on Checkboxes it indicates the value of the selectable element

bull state when this function is run on Radio Buttons this parameter is not used When run on Checkboxes it indicates the status of the element

bull CLICKED_ELEMENT mark the element

bull NON_CLICKED_ELEMENT leave the element as unmarked

bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked

o input(field position values) function that indicates the values added to an input field

bull field name of the HTML input field

bull position position of the field in the event of several on the form with the same name

bull values list of values that must be selected in the field

o textarea(field position values) this indicates the values added to a text area

bull field name of the HTML text area field

bull position position of the field in the event of several on the form with the same name

bull values list of values that must be selected in the field

o toList() returns the list with the NSEQL sequences used in each iteration

o setMaxIterations(count) sets the maximum number of iterations that can be executed

bull count number that determines the maximum number of iterations

o setRetries(count) update method for the number of retries in the event of failures

bull count number of retries

o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated

bull mseconds this indicates the waiting time between retries in milliseconds

o setParallelIterator(flag) the component launches the iteration in parallel

bull flag "true": the iterations will be executed in parallel

o next(inputPage) this returns the page resulting from running a component iteration

bull inputPage optional parameter that allows for a new starting page to be indicated on which a new component iteration is run

o hasNext() function that determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not

o close() function that closes the iterator

o syncWithPost(flag) this function indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method

bull flag "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run

o setBackSequence(back) this function optionally allows an explicit navigation sequence back to the source page to be indicated, so that more data extraction operations can be carried out on it

bull back NSEQL back program

o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not

bull reusingConnection if "true", the connection from previous components is reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session

o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 Denodo HTTP browser implementation
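
The clickedArray constants described for selectMultiplePositions and selectMultipleTexts determine how many form submissions the iterator generates. The sketch below expands a clickedArray into the marked and unmarked combinations it implies. It is a model under assumptions: the constants are stand-in strings rather than ITPilot's real values, expandClickStates is a hypothetical helper, and the combinations of several elements are assumed to multiply.

```javascript
// Stand-in constants named after those in the text.
const CLICKED_ELEMENT = "CLICKED_ELEMENT";
const NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
const CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Returns one array of booleans (true = marked) per generated combination.
function expandClickStates(clickedArray) {
  let combinations = [[]];
  for (const state of clickedArray) {
    let options;
    if (state === CLICKED_ELEMENT) {
      options = [true];
    } else if (state === NON_CLICKED_ELEMENT) {
      options = [false];
    } else {
      options = [true, false]; // both the marked and the unmarked variant
    }
    const next = [];
    for (const combination of combinations) {
      for (const option of options) {
        next.push([...combination, option]);
      }
    }
    combinations = next;
  }
  return combinations;
}

const combos = expandClickStates([
  CLICKED_ELEMENT,
  CLICKED_AND_NON_CLICKED_ELEMENT,
  NON_CLICKED_ELEMENT,
]);
```

Here combos contains two combinations, [true, true, false] and [true, false, false], because only the second element contributes both states.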

5315 Get Page

bull Object Get_Page

bull Description obtains an active browser from the browser pool from a previously retrieved identification code

bull Functions

o Constructor(browserUuid) obtains (or optionally creates) the handler to an active browser from its identification

bull browserUuid browser id

o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain) executes the component and returns a Page object with information about the browser's current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27)

bull pageType type of browser used to access the page

bull SEQUENCE_IEBROWSER = 1

bull SEQUENCE_HTTP_BROWSER = 2

bull lastURL last URL where the page is coming from

bull lastURLMethod access method (GET POST) of the URL the page is coming from

bull lastURLPostParameters POST-method parameters of the URL the page is coming from

bull cookie information storage "cookies"

bull proxyUser user name to access the Proxy if required

bull proxyPassword user password to access the Proxy if required

bull proxyDomain Proxy domain if required

5316 Init

bull Object Init

bull Description is responsible for storing the structure of the input data which is the data that the wrapper will receive from the calling application

bull Functions

o Constructor(input output)

bull input input record of the component. Optional; used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context

bull output name of the output record of the component which represents the query parameters of the wrapper Its use is optional in the standard process main function if not specified the record will be generated at runtime (with the exec() function)

o get(name) this returns the value of a record field created as a group of initialization parameters

bull name name of the record field

o setText(field obl fixedValue) this creates a text-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field
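
The three obl values can be read as a validation rule applied to each incoming query. The following sketch models that rule in plain JavaScript. It is not the Init component: resolveQueryParameters and the field descriptor objects are hypothetical, the constants are stand-in strings, and the choice to let a FIXED value override anything supplied in the query is an assumption.

```javascript
// Stand-in constants named after the obl values in the text.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// Hypothetical helper: applies the obl rules of each field to a query.
function resolveQueryParameters(fields, query) {
  const resolved = {};
  for (const field of fields) {
    const supplied = Object.prototype.hasOwnProperty.call(query, field.name);
    if (field.obl === FIXED) {
      resolved[field.name] = field.fixedValue; // constant value
    } else if (supplied) {
      resolved[field.name] = query[field.name];
    } else if (field.obl === OBLIGATORY) {
      throw new Error("missing obligatory parameter: " + field.name);
    }
    // An OPTIONAL field without a value is simply left out.
  }
  return resolved;
}

const fields = [
  { name: "TITLE", obl: OBLIGATORY },
  { name: "LANG", obl: FIXED, fixedValue: "en" },
  { name: "YEAR", obl: OPTIONAL },
];
const resolved = resolveQueryParameters(fields, { TITLE: "dune" });
```

With this query resolved holds TITLE and the fixed LANG value, YEAR is absent, and omitting TITLE would raise an error.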

o setInt(field obl fixedValue) this creates an integer-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setLong(field obl fixedValue) this creates a long-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setFloat(field obl fixedValue) this creates a floating-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setDouble(field obl fixedValue) this creates a double-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setBlob(field obl fixedValue) this creates a BLOB-type (binary large object) field in the initialization record

bull field name of the field to create

    • obl: indicates whether the field must be given a value in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field must be given a value in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field must be given a value in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field, as defined in [DATEFORMAT]. The format is optional, but once one is specified it becomes compulsory; otherwise the wrapper may not run.
    • obl: indicates whether the field must be given a value in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): updates the component name.
    • name: new component name.

  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
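The setter calls described above can be tried outside ITPilot with a stand-in object. The following is a minimal sketch under stated assumptions, not the ITPilot implementation: StubInit is a hypothetical stub that copies only the documented method names, the OPTIONAL, OBLIGATORY and FIXED values are placeholder strings, and the field names are invented.

```javascript
// Placeholder values; how ITPilot represents them internally is an assumption.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical stand-in for the Init component.
function StubInit() {
    this.name = null;
    this.i18n = null;
    this.fields = {};
}
// Shared helper (not part of the documented API) that stores one field.
StubInit.prototype.addField = function (type, field, obl, fixedValue, format) {
    obl = obl || OPTIONAL; // OPTIONAL is the documented default
    if (obl === FIXED && fixedValue === undefined) {
        throw new Error("FIXED field " + field + " needs a fixedValue");
    }
    this.fields[field] = { type: type, obl: obl, fixedValue: fixedValue, format: format };
};
StubInit.prototype.setFloat = function (field, obl, fixedValue) {
    this.addField("float", field, obl, fixedValue);
};
StubInit.prototype.setDate = function (field, format, obl, fixedValue) {
    this.addField("date", field, obl, fixedValue, format);
};
StubInit.prototype.setName = function (name) { this.name = name; };
StubInit.prototype.setI18n = function (i18n) { this.i18n = i18n; };
// Returns the stored fields as the "initialization record" of this stub.
StubInit.prototype.exec = function () { return this.fields; };

var init = new StubInit();
init.setName("Init");
init.setI18n("ES_EURO");
init.setFloat("MAX_PRICE", OBLIGATORY);      // must be given in every query
init.setFloat("TAX_RATE", FIXED, 0.21);      // constant value
init.setDate("RELEASED", "dd/MM/yyyy");      // optional, with a date format
var initRecord = init.exec();
```

In this sketch a FIXED field without a fixedValue is rejected immediately; whether ITPilot reports that case at definition time or at query time is not stated in the text above.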

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records over which to iterate.
  o hasNext(): determines whether there are more results over which to iterate. It returns “true” if there is at least one more result.
  o next(): returns the next element of the iteration. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}

Figure 5 Using threads in the Iterator component
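The pattern in Figure 5 can be tried outside ITPilot with stand-in objects. The sketch below is only an illustration: StubIterator and StubThread are hypothetical stubs that mimic the documented method names (hasNext, next, execute, wait), the stub thread runs each call immediately instead of in parallel, and processRecord is an invented function. Passing the function itself to execute, rather than its name, is also an assumption of this sketch.

```javascript
// Hypothetical stand-ins for the ITPilot Iterator and Thread objects.
// The real objects are provided by the ITPilot runtime.
function StubIterator(list) {
    this.list = list;
    this.position = 0;
}
StubIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
StubIterator.prototype.next = function () {
    return this.list[this.position++];
};

function StubThread() {
    this.launched = 0;
}
// Runs the function immediately; a real Thread would run it in parallel.
StubThread.prototype.execute = function (fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    this.launched++;
    fn.apply(null, args);
};
// Nothing is pending in this sequential stub, so wait() returns at once.
StubThread.prototype.wait = function () {};

var processed = [];
// Invented per-record function, playing the role of _functionIterator_1.
function processRecord(structureInstance, recordInstance) {
    processed.push(structureInstance.name + ":" + recordInstance.TITLE);
}

var structureInstance = { name: "FILM" };
var iterator = new StubIterator([{ TITLE: "A" }, { TITLE: "B" }]);
var thread0 = new StubThread();
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    thread0.execute(processRecord, structureInstance, recordInstance);
}
thread0.wait(); // returns once every launched execution has finished
```

Because real executions run concurrently, the per-record function should not rely on the order in which records are processed; the stub preserves order only because it is sequential.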

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}

Figure 6 Using the Loop function
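The structure in Figure 6 can be exercised outside ITPilot with a stand-in for Condition. This is a minimal sketch under stated assumptions: StubCondition is a hypothetical stub whose "expression" is a plain JavaScript function rather than an ITPilot expression, and RUNTIME_ERROR and ON_ERROR_RAISE are placeholder strings defined only so that the call mirrors the figure.

```javascript
// Hypothetical stand-in for the ITPilot Condition object.
function StubCondition(test) {
    this.test = test;
}
// The real component registers an error handler; this stub ignores it.
StubCondition.prototype.onError = function (errorType, action) {};
// Evaluates the condition and returns a boolean.
StubCondition.prototype.exec = function (inputs) {
    return this.test(inputs);
};

// Placeholder constants so the call matches Figure 6.
var RUNTIME_ERROR = "RUNTIME_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

var pagesVisited = 0;
var loop = new StubCondition(function () {
    return pagesVisited < 3; // keep looping while fewer than 3 pages were visited
});
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);

while (loop.exec([])) {
    pagesVisited++; // the loop operations would go here
}
```

The condition is checked before each pass, so the body does not run at all if the condition is false on entry.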

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: allows iteration over different inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the given component input record to the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
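The way add() expressions refer to input records by position can be illustrated with a small stand-in. This is a sketch under stated assumptions, not the ITPilot implementation: StubRecordConstructor is a hypothetical stub that understands only expressions of the form "$<index>.<FIELD>", records are plain JavaScript objects, the error constants are placeholder strings, and the field and record names are invented. Its handling of ON_ERROR_IGNORE (leaving the field out) is this sketch's own choice.

```javascript
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical stand-in for Record_Constructor. The real component accepts
// the full expression language described in [GENER].
function StubRecordConstructor(recordsObj, name) {
    this.recordsObj = recordsObj;
    this.name = name;
    this.fields = [];
}
StubRecordConstructor.prototype.add = function (fieldName, expression, errorAction) {
    this.fields.push({ fieldName: fieldName, expression: expression, errorAction: errorAction });
};
StubRecordConstructor.prototype.exec = function () {
    var output = {};
    for (var i = 0; i < this.fields.length; i++) {
        var field = this.fields[i];
        // "$0.TITLE" -> record index 0, field TITLE
        var match = /^\$(\d+)\.(\w+)$/.exec(field.expression);
        var source = match ? this.recordsObj[Number(match[1])] : undefined;
        if (source && match[2] in source) {
            output[field.fieldName] = source[match[2]];
        } else if (field.errorAction === ON_ERROR_RAISE) {
            throw new Error("Cannot evaluate " + field.expression);
        }
        // With ON_ERROR_IGNORE this stub simply leaves the field out.
    }
    return output;
};

var film = { TITLE: "Alien", YEAR: 1979 };
var shop = { PRICE: 9.5 };
var builder = new StubRecordConstructor([film, shop], "OFFER");
builder.add("TITLE", "$0.TITLE", ON_ERROR_RAISE);   // from the first input record
builder.add("PRICE", "$1.PRICE", ON_ERROR_RAISE);   // from the second input record
builder.add("STOCK", "$1.STOCK", ON_ERROR_IGNORE);  // missing field, error ignored
var offer = builder.exec();
```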

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));

Figure 7 Using the Repeat function
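The practical difference between the do… while structure of Figure 7 and a plain while loop is that the body runs once before the condition is first evaluated. The sketch below shows this with a hypothetical StubCondition whose "expression" is a plain JavaScript function; it is a stand-in for illustration, not the ITPilot Condition object.

```javascript
// Hypothetical stand-in for the ITPilot Condition object.
function StubCondition(test) {
    this.test = test;
}
StubCondition.prototype.exec = function (inputs) {
    return this.test(inputs);
};

// A condition that is false from the very first evaluation.
var repeat = new StubCondition(function () {
    return false;
});

var repeatRuns = 0;
do {
    repeatRuns++; // the loop operations would go here
} while (repeat.exec([]));

var whileRuns = 0;
while (repeat.exec([])) {
    whileRuns++; // never reached with the same condition
}
```

With the same always-false condition, the do… while body executes once and the while body never executes.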

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it occupies in the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.

5.3.28 Store File

• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): puts the caller on standby until all the executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread of execution that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
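The queuing effect of setMaxConcurrentThreads can be pictured with a sequential model. The sketch below is a deliberate simplification, not how ITPilot schedules threads: StubThread is a hypothetical stub that defers every execute call until wait() and then runs the queued calls in groups of at most the configured size, whereas a real pool would start a queued call as soon as any running one finishes. The fetchPrice function is invented.

```javascript
// Hypothetical, sequential model of a Thread with a concurrency limit.
function StubThread() {
    this.max = 1;
    this.queue = [];
    this.batches = []; // results grouped by the "round" in which they ran
}
StubThread.prototype.setMaxConcurrentThreads = function (max) {
    this.max = max;
};
// Queues the call; the extra arguments are passed on to the function.
StubThread.prototype.execute = function (fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    this.queue.push(function () {
        return fn.apply(null, args);
    });
};
// Drains the queue, at most `max` calls per round.
StubThread.prototype.wait = function () {
    while (this.queue.length > 0) {
        var batch = this.queue.splice(0, this.max);
        var results = [];
        for (var i = 0; i < batch.length; i++) {
            results.push(batch[i]());
        }
        this.batches.push(results);
    }
};

// Invented task standing in for per-record processing.
function fetchPrice(id) {
    return id * 10;
}

var pool = new StubThread();
pool.setMaxConcurrentThreads(2);
for (var id = 1; id <= 5; id++) {
    pool.execute(fetchPrice, id);
}
pool.wait();
```

With five requests and a limit of two, the model needs three rounds; the last round holds the single request that had to wait for earlier ones.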

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }

  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
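The four functions described above might be laid out in a single .js file as in the following sketch. Several things here are assumptions for illustration only: the component does something invented (it upper-cases a TEXT field), the input and output structures are shown as plain objects because the text does not say how ITPilot expects them to be built, and RECORD_TYPE is defined locally so the file can be run outside ITPilot.

```javascript
// Output type code taken from the list above; defined here only so the
// sketch runs outside ITPilot.
var RECORD_TYPE = 3;

// Main function: receives the input element and returns the output.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    // Invented behaviour: copy TEXT in upper case into TEXT_UPPER.
    mycustom_output = { TEXT_UPPER: String(mycustom_input.TEXT).toUpperCase() };
    return mycustom_output;
}

// Input schema. The plain-object shape is an assumption of this sketch.
function mycustom_getInputStructure() {
    return { TEXT: "string" };
}

// Output type of the component.
function mycustom_getOutputType() {
    return RECORD_TYPE;
}

// Required here because the output type is RECORD_TYPE.
// The plain-object shape is again an assumption.
function mycustom_getOutputStructure() {
    return { TEXT_UPPER: "string" };
}

var result = mycustom_main({ TEXT: "itpilot" });
```

Every function name carries the component name as a prefix, so a component called "mycustom" lives in a file whose functions all start with mycustom_.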

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section.

To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
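The point of the try/finally structure in Figure 8 is that the scope is closed whether or not the component fails. The sketch below demonstrates this with hypothetical stand-ins: the SCOPE object and the CUSTOM_COMPONENT constructor defined here are stubs written for this example (ITPilot supplies the real ones), and the type name "mycustom" and component name "mycustom_1" are invented.

```javascript
// Hypothetical stand-ins for SCOPE and CUSTOM_COMPONENT.
var scopeLog = [];
var SCOPE = {
    create: function () { scopeLog.push("create"); },
    close: function () { scopeLog.push("close"); }
};
function CUSTOM_COMPONENT(type) {
    this.type = type;
    this.componentName = null;
}
CUSTOM_COMPONENT.prototype.setComponentName = function (name) {
    this.componentName = name;
};
// The stub fails on a null input so that the error path can be exercised.
CUSTOM_COMPONENT.prototype.exec = function (input) {
    if (input === null) {
        throw new Error("no input");
    }
    return { component: this.componentName, echoed: input };
};

function runCustom(input) {
    var output = null;
    try {
        SCOPE.create();
        var mycustom = new CUSTOM_COMPONENT("mycustom"); // invented type name
        mycustom.setComponentName("mycustom_1");         // invented component name
        output = mycustom.exec(input);
    } finally {
        SCOPE.close(); // runs even when exec throws
    }
    return output;
}

var ok = runCustom({ TEXT: "a" });
var failed = false;
try {
    runCustom(null);
} catch (e) {
    failed = true;
}
```

After both calls, scopeLog holds a matching "close" for every "create", including the call that threw.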

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the VQL statement just has to be written as follows:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the recently generated JavaScript code.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘\’ character.
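Escaping the quotes by hand is error-prone for long scripts, so the statement can be assembled programmatically. The helper below is a hypothetical sketch, not part of ITPilot: the function name is invented, and both the delimiting quote (double quote) and the escape character (backslash) are parameters whose defaults are assumptions that should be checked against [VQL] before use. It escapes only the delimiting quote, as the note describes.

```javascript
// Builds a CREATE WRAPPER statement around a piece of JavaScript code.
// ASSUMPTIONS: the default delimiter (") and escape character (\) are
// illustrative; confirm both against [VQL].
function buildCreateWrapper(name, jscode, quote, escapeChar) {
    quote = quote || '"';
    escapeChar = escapeChar || "\\";
    // Prefix every occurrence of the delimiter inside the code with the
    // escape character.
    var escaped = jscode.split(quote).join(escapeChar + quote);
    return "CREATE WRAPPER ITP " + name + " " + quote + escaped + quote;
}

var statement = buildCreateWrapper("films", 'var s = "x";');
```

Passing the delimiter and escape character explicitly, for example buildCreateWrapper("films", code, "'", "\\"), adapts the helper if the server expects a different quoting style.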

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262, ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045, Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

SimpleVO directorVO = (SimpleVO) tuple.getValue("DIRECTOR");
String director = (String) directorVO.getValue();
System.out.println("DIRECTOR " + director);

// Get EDITIONS array
ArrayVO editionsVO = (ArrayVO) tuple.getValue("EDITIONS");

// Iterate over EDITION registers
int numEditions = 0;
Iterator editions = editionsVO.getValues().iterator();
while (editions.hasNext()) {
    numEditions++;
    System.out.println("EDITION " + numEditions);
    RegisterVO editionVO = (RegisterVO) editions.next();
    Map edition = editionVO.getValues();
    SimpleVO formatVO = (SimpleVO) editionVO.get("FORMAT");
    String format = (String) formatVO.getValue();
    System.out.println("\t FORMAT " + format);
    DoubleVO priceVO = (DoubleVO) editionVO.getValue("PRICE");
    Double price = priceVO.getDouble();
    System.out.println("\t PRICE " + price);
    SimpleVO descriptionVO = (SimpleVO) editionVO.getValue("DESCRIPTION");
    String description = (String) descriptionVO.getValue();
    System.out.println("\tDESCRIPTION " + description);
}
System.out.println();

// Check errors
if (results.checkErrors()) {
    System.out.println("Error " + results.getErrorDescription());
}
} catch (Exception e) {
    System.err.println("Error trying to access server ");
} finally {
}

Figure 1 Example of query execution to a wrapper

4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. We recommend developing custom functions using Java annotations. It is also possible to use naming conventions, which allows custom functions to be created without dependencies on Denodo libraries. The annotations, and the compound types and values required to create custom functions, are located in:

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading/removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, the class must define a Java method implementing the functionality of the function with those arguments, plus one additional method if the signature returns a different type depending on the parameters or if the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to ITPilot data types. The following table shows how the mapping works and which Java types can be used:

Java                  ITPilot
java.lang.Integer     int
java.lang.Long        long
java.lang.Float       float
java.lang.Double      double
java.lang.Boolean     boolean
java.lang.String      text
java.util.Calendar    date
byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be Java primitive types (int, long, double, etc.).
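As an illustration of the rules above, the following minimal sketch defines one function with two signatures, each implemented by its own method and each using a wrapper class rather than a primitive type as its parameter. It relies on the naming convention described in section 4.1 (class name ending in ItpFunction, methods named execute), so it needs no Denodo library. The function name Abs_Sample, its behaviour and its handling of null values are assumptions made only for this example, and the class is shown without the public modifier so that the sketch compiles in any single file.

```java
// Hypothetical custom function with two signatures (not part of ITPilot).
// Parameters are wrapper classes (Integer, Long), never primitives.
class Abs_SampleItpFunction {

    // Signature taking an int (java.lang.Integer).
    public Integer execute(Integer value) {
        if (value == null) {
            return null; // assumption: a null input yields a null result
        }
        return Math.abs(value);
    }

    // Signature taking a long (java.lang.Long).
    public Long execute(Long value) {
        if (value == null) {
            return null; // assumption: a null input yields a null result
        }
        return Math.abs(value);
    }
}
```

Because both methods are named execute, the argument type alone selects the signature that is run.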


4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.

To make a Java class recognizable as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array, e.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: class annotation used to define the class as a custom function. The annotation requires the parameters:
    o name: name of the custom function.
    o type: in ITPilot it must be CustomElementType.ITPFUNCTION.
• com.denodo.common.custom.annotations.CustomExecutor: method annotation used to specify the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional parameter, syntax, which specifies how the function signature is presented to the user in the Wrapper Generation Tool.
• com.denodo.common.custom.annotations.CustomExecutorReturnType: method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.
• com.denodo.common.custom.annotations.CustomParam: parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.
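The naming convention can be sketched as follows. The class name Concat_SampleItpFunction and the execute(String... inputs) declaration come from the text above; the concatenation logic and the treatment of null arguments are assumptions made for this example. The class is shown without the public modifier so that it compiles in any single file, and no Denodo library is required.

```java
// Naming-convention sketch: the "ItpFunction" suffix marks the class as a
// custom function, here named Concat_Sample.
class Concat_SampleItpFunction {

    // Arity n: the last (here, the only) parameter is an array.
    public String execute(String... inputs) {
        if (inputs == null) {
            return null;
        }
        StringBuilder result = new StringBuilder();
        for (String input : inputs) {
            if (input != null) { // assumption: null arguments are skipped
                result.append(input);
            }
        }
        return result.toString();
    }
}
```

An annotated version of the same function would replace the naming pattern with the CustomElement and CustomExecutor annotations, at the cost of depending on denodo-custom.jar.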

4.2 COMPOUND TYPES

Compound types and values in the custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types, or a Denodo compound type (CustomRecordType or CustomArrayType).
• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).
• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.
• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.
• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object in their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on input values, or functions returning compound types, can implement an additional method in order to compute the return type without executing the function. This is entirely optional, but it provides better performance when the execution of the function is slower or more memory intensive than the return type calculation. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, the number of fields and their names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same or an equivalent type as its respective parameter in the execute method.

The value returned by the additional method depends on what the execute method returns:

• If the execute method returns a basic Java type, the additional method has to return the same basic Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table 'Equivalency between Java and ITPilot data types' at the beginning of section 4 to know the type that these return values will have in ITPilot.
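The pairing between an execution method and its return-type method can be sketched with the naming convention, which avoids any Denodo dependency. The function Repeat_Sample, its behaviour and its null handling are hypothetical, and the declared return type Class<String> of executeReturnType is an assumption; the text above only states that the method has to return java.lang.String.class when execute returns a String. The class is shown without the public modifier so that it compiles in any single file.

```java
// Hypothetical function taking (text, int), written with the naming
// convention so that it has no Denodo dependencies.
class Repeat_SampleItpFunction {

    // Execution method: repeats the text the requested number of times.
    public String execute(String text, Integer times) {
        if (text == null || times == null) {
            return null; // assumption: null in, null out
        }
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < times; i++) {
            result.append(text);
        }
        return result.toString();
    }

    // Return-type method: same number and types of parameters as execute.
    // execute returns a String, so this method returns String.class.
    public Class<String> executeReturnType(String text, Integer times) {
        return String.class;
    }
}
```

For a function this simple the return-type method is optional under rule 1; it becomes mandatory only when execute returns a non-constant compound type or a java.lang.Object.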


4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        // Execution method: returns one record per substring.
        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name="regexp") String regex,
                @CustomParam(name="value") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            List<CustomRecordValue> arrayValues =
                    new ArrayList<CustomRecordValue>(result.length);
            for (String string : result) {
                // A fresh map per record, so that records never share state.
                LinkedHashMap<String, Object> results =
                        new LinkedHashMap<String, Object>(1);
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue =
                        CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        // Return-type method: describes the array without running the split.
        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props =
                    new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record =
                    CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array =
                    CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2 ITPilot Custom Function Sample


5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262].

The following sections assume some previous basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components (section 5.3) and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3 ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: it is the only mandatory one and contains the component implementation.
2. getInit() function: this must be used to return the set of searchable parameters.
3. getOutputSchema() function: this function is used to return the structure of the output objects, if they exist (1).

These functions mirror the definition of the process as components: the input parameters are the ones defined in the Initialization component, and the output record is defined just as it is received by the Output component.

(1) Since version 4.0SP1 this function, previously known as getMetadata, has been renamed to getOutputSchema. There is backwards compatibility, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

The getInit() function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on in section 5.3 (the Component Catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.

5.2.3 Generating the Output Structure

The getOutputSchema() function determines the wrapper's output structure, if one exists. The structure is a data record, implemented by the Record_Structure object and defined in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

• rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL).
• rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections define them.

5.3.2.1 Record Structure

• Object: Record_Structure
• Description: represents a data structure that allows the structure of a specific record to be defined. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
    o Constructor(name)
        • name: name of the structure.
    o setText(field, regexp, type): creates a new character string field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".
        • type (optional): defines whether the field is mandatory or not. By default the field is optional.
    o setLink(field, type): creates a new Link-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default the field is optional.
    o setInt(field, type): creates a new Integer-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default the field is optional.
    o setBoolean(field, type): creates a new boolean-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default the field is optional.
    o setLong(field, type): creates a new Long-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default the field is optional.
    o setFloat(field, type): creates a new Float-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default the field is optional.
    o setDouble(field, type): creates a new Double-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default the field is optional.
    o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
        • field: name of the new field.
        • type (optional): defines whether the field is mandatory or not. By default the field is optional.
    o setDate(field, regexp, format, type): creates a new Date-type field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is "".
        • format (optional): date format, following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.
        • type (optional): defines whether the field is mandatory or not. By default the field is optional.
    o setRegister(record, type): creates a new Record-type field in the record.
        • record: record name.
        • type (optional): defines whether the field is mandatory or not. By default the field is optional.
    o setArray(name, structure, type): creates a new Array-type field in the record.
        • name: name of the array.
        • structure: data structure that represents the record structure contained in the array.
        • type (optional): defines whether the field is mandatory or not. By default the field is optional.
    o toString(): transforms the record into a string of characters for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR, as explained in section 5.3.22, must be used, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List
• Functions:
    o setListName(listName): sets the name of the list.
        • listName: name of the list.
    o add(obj): adds an element to the list.
        • obj: element to add.
    o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates the components that do not provide some of these "common" functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave in the event of an error. The onError function can be invoked several times, with different errorId parameter values.
    o errorId: indicates the type of error for which the behavior is to be managed. The possible values are:
        • RUNTIME_ERROR: error while the component is being run.
        • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
        • HTTP_ERROR: error produced by an http error.
        • TIMEOUT_ERROR: error caused when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for http browser.
        • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
    o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
        • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
        • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have any kind of return value will return "null" in case there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", it will be evaluated as "false" if they are used in a condition expression.
        • ON_ERROR_RETRY: rerun the wrapper. The number of retries and time between retries are configured in each parameter.
        • ON_ERROR_RETRY_IGNORE: rerun the wrapper as with ON_ERROR_RETRY, but continue with the wrapper execution if the error still happens after the retries.

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level to be used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:
    o TRACE
    o DEBUG
    o INFO
    o WARN
    o ERROR
    o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
    o Constructor()
    o exec(record, list): executes the function.
        • record: record to be added to the list.
        • list: list to which the record is added.

5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
    o Constructor(expr)
        • expr: this parameter defines the condition expression. It is expressed as a string of characters (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the input parameter elements.
        • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, ..., ELEMENTN]", determines the elements on which the condition is evaluated.

5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
    o Constructor(listname): creates an empty list.
        • listname: name of the list of records to be created.
    o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
    o Constructor(): creates a persistent browser and returns its handler.
    o exec(): executes the component.

5.3.8 Diff

• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
    o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
        • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
        • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
        • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
        • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
        • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
    o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
        • baseCode: character string with the source page content.
        • finalCode: character string or page object with the target page content.
    o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
        • baseCode: character string with the source page content.
        • finalCode: character string or page object with the target page content.
    o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
        • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
    o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
        • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
        • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
        • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
        • nullWhenEquals: "true" implies that "null" will be returned when both pages are equal; "false" means that the result page will be returned.
    o setIgnoreTagAttributes(simplifyTags): the component will not take the HTML tag attributes into account when comparing both pages.
        • simplifyTags: "true" means that the HTML tag attributes will be ignored. With "false" they will not be ignored.
    o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
        • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
    o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
        • mergedDeletions: with "true", the deleted content will be shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
    o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to HTML tokens of the source pages before comparing them.
        • replacement: Perl [PERL] regular expression.
    o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. These regular expressions can be applied to HTML tokens of the page. Tokens that match a regular expression are discarded before starting the comparison.
        • regexp: Perl [PERL] regular expression.

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4 Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression
• Description: allows an expression to be defined (based on constants and/or the use of functions provided by ITPilot) that will be evaluated to an output value.
• Functions:
    o Constructor(expression)
        • expression: object that defines the condition expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
        • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor
• Description: this component is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
    o Constructor(name, page, specification, structure)
        • name: name of the Extractor component instance.
        • page: page-type ITPilot structure from which data is to be extracted.
        • specification: DEXTL data extraction specification (see [DEXTL]).
        • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
    o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
    o setMergePatterns(merge): applies the technique of merging patterns for greater system optimization (see [GENER] for further information).
        • merge: Boolean parameter, "true" if the pattern merge technique is to be applied or "false" if not. It is "true" by default.
    o setI18n(i18n): updates the process internationalization.
        • i18n: type of internationalization to use. ITPilot provides different internationalization options such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 33

5312 Fetch

bull Object Fetch

bull Description this obtains the contents of the URL or page used as the input argument and returns them in binary or text format

bull Functions

o Constructor(url sequenceType reusableConnection binary page)

bull url URL where the resource to be downloaded can be found (OPTIONAL)

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reusableConnection This indicates whether the connection will be reused (ldquotruerdquo) or not (ldquofalserdquo) See [GENER] for further information

bull binary ldquotruerdquo The object is binary ldquofalserdquo The object to be downloaded is in text format

bull page Optionally the page from which the http request is launched can be indicated

o exec(page) This runs the component returning the string- or binary-type value obtained

bull page Optionally the page from which the http request is launched can be indicated

o setEncoding(encoding) allows the user to determine the MIME type [MIME] of the information to send

bull encoding MIME type of the information to send

o syncWithPost(flag) this function lets the user set the method for recovering the page state ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page This is the default synchronization method

bull flag ldquotruerdquo means that this synchronization function must be used If it is lsquofalsersquo ITPilot checks whether a back sequence exists or not defined by the setBackSequence function if it does not exist ITPilot executes a Back() NSEQL command

o setBackSequence(back) this function lets the user optionally set an explicit browse sequence to the page it comes from which more information extraction operations are going to be executed against

bull back back sequence NSEQL program

o setReusingConnection(reusingConnection) this function indicates whether connections will be reused or not

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 34

    • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing the information from the previous session.
  o setBackPages(pages): sets the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
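The order in which the synchronization options above are consulted can be sketched in plain JavaScript. This is not ITPilot code: chooseBackStrategy and the shape of its config argument are hypothetical names introduced only for illustration, and the default of one page for Back() is an assumption. The sketch merely mirrors the precedence described above (POST synchronization, then an explicit back sequence, then the Back() command).

```javascript
// Hypothetical helper (not part of the ITPilot API): decides how a component
// would return to its source page, following the precedence described above.
function chooseBackStrategy(config) {
    // syncWithPost is described as the default synchronization method.
    var usePost = config.syncWithPost === undefined ? true : config.syncWithPost;
    if (usePost) {
        return { strategy: "POST" };
    }
    // Otherwise an explicit back sequence (setBackSequence) takes priority.
    if (config.backSequence) {
        return { strategy: "BACK_SEQUENCE", sequence: config.backSequence };
    }
    // Last resort: the NSEQL Back() command, applied backPages times.
    // Defaulting to a single page is an assumption of this sketch.
    return { strategy: "BACK_COMMAND", pages: config.backPages || 1 };
}

console.log(chooseBackStrategy({}).strategy);
console.log(chooseBackStrategy({ syncWithPost: false, backPages: 2 }));
```

Keeping the decision in one function makes the fallback order easy to see: setBackPages only matters once both POST synchronization and a back sequence are ruled out.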

5.3.13 Filter

• Object: Filter
• Description: filters a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that satisfies the selection expression given in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Filter returns the list of filtered elements except for the one that caused the error.
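The behavior described in the note can be illustrated with a plain-JavaScript analogue. This is only a sketch under stated assumptions: filterRecords is a hypothetical function, a JavaScript predicate stands in for the ITPilot filter expression (which the sketch does not try to parse), and the two constants are string stand-ins rather than the real ITPilot values.

```javascript
// Stand-in constants; the real values used by ITPilot are not given here.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical analogue of Filter.exec(): keeps the records for which the
// predicate holds. The predicate replaces the ITPilot filter expression.
function filterRecords(inputRecords, predicate, errorAction) {
    var result = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) {
                result.push(inputRecords[i]);
            }
        } catch (e) {
            if (errorAction !== ON_ERROR_IGNORE) {
                throw e; // ON_ERROR_RAISE: stop and report the error
            }
            // ON_ERROR_IGNORE: skip only the record that caused the error
        }
    }
    return result;
}

var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];

// Example predicate that fails on records without a price.
function cheaperThan20(record) {
    if (record.PRICE === null) {
        throw new Error("PRICE is missing");
    }
    return record.PRICE < 20;
}

console.log(filterRecords(records, cheaperThan20, ON_ERROR_IGNORE));
```

With ON_ERROR_IGNORE the record without a price is dropped and the remaining records are still evaluated; with ON_ERROR_RAISE the same call would stop at the second record.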

5.3.14 Form Iterator

• Object: Form_Iterator
• Description: generates an execution loop over a specific form, using predetermined values for each of its fields in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form is invoked iteratively.
    • parallelIterator: if "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or only contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field has the same name.
    • positions: values of the elements over which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several on the form with the same value.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each value element in the event of replicated values.
    • equals: boolean value that indicates whether the field values must exactly match those provided to the function or may only contain them.

  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is made.
    • value: when the function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: not used when the function is run on radio buttons. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML input field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if "true", the iterations are executed in parallel.
  o next(inputPage): returns the page that results from running one iteration of the component.
    • inputPage: optional parameter that indicates a new starting page on which the next iteration is run.

  o hasNext(): determines whether there are more results. Returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
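How the click constants multiply the number of iterations can be sketched in plain JavaScript. The code below does not use the Form_Iterator object: expandClickCombinations and CombinationIterator are hypothetical helpers, the constants are string stand-ins, and the field values are invented. It only shows that each CLICKED_AND_NON_CLICKED_ELEMENT entry doubles the combinations, and how a hasNext()/next() loop capped by a maximum number of iterations walks through them.

```javascript
// Stand-in constants; the real ITPilot values are not given here.
var CLICKED_ELEMENT = "CLICKED_ELEMENT";
var NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
var CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Hypothetical helper: expands values and their click states into every
// combination of marked (true) and unmarked (false) elements.
function expandClickCombinations(valuesArray, clickedArray) {
    var combinations = [{}];
    for (var i = 0; i < valuesArray.length; i++) {
        var states;
        if (clickedArray[i] === CLICKED_ELEMENT) {
            states = [true];
        } else if (clickedArray[i] === NON_CLICKED_ELEMENT) {
            states = [false];
        } else {
            states = [true, false]; // both variants are generated
        }
        var expanded = [];
        for (var c = 0; c < combinations.length; c++) {
            for (var s = 0; s < states.length; s++) {
                var copy = Object.assign({}, combinations[c]);
                copy[valuesArray[i]] = states[s];
                expanded.push(copy);
            }
        }
        combinations = expanded;
    }
    return combinations;
}

// Minimal iterator with the hasNext()/next() protocol, capped in the way
// setMaxIterations is described to cap the component.
function CombinationIterator(combinations, maxIterations) {
    var limit = Math.min(combinations.length, maxIterations);
    var index = 0;
    this.hasNext = function () { return index < limit; };
    this.next = function () { return combinations[index++]; };
}

var combos = expandClickCombinations(
    ["sports", "news"],
    [CLICKED_ELEMENT, CLICKED_AND_NON_CLICKED_ELEMENT]);
var it = new CombinationIterator(combos, 10);
while (it.hasNext()) {
    console.log(JSON.stringify(it.next()));
}
```

Here "sports" is always marked while "news" produces one run marked and one unmarked, so the loop body executes twice.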

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters for later browsing by means of a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page comes from.
    • lastURLMethod: access method (GET, POST) of the URL the page comes from.
    • lastURLPostParameters: POST-method parameters of the URL the page comes from.
    • cookie: information stored in "cookies".
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (by the exec() function).
  o get(name): returns the value of a field of the record created as the set of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional, but once specified it is compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): sets the component name.
    • name: new component name.
  o setI18n(i18n): sets the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component; returns a record that represents the wrapper initialization parameters.
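The three values of obl can be illustrated with a small stand-in written in plain JavaScript. InitSketch is a hypothetical object, not the ITPilot Init component: its exec() receives the query values as an argument (the real exec() takes none), only setText and setInt are mimicked, the field names are invented, and the rule that a FIXED field ignores any value supplied in the query is an assumption of the sketch.

```javascript
// Stand-in constants; the real ITPilot values are not given here.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical stand-in for Init. exec() takes the query values explicitly
// so that the sketch can run outside a wrapper.
function InitSketch() {
    this.fields = [];
}

InitSketch.prototype.setField = function (field, type, obl, fixedValue) {
    this.fields.push({
        name: field,
        type: type,
        obl: obl || OPTIONAL, // OPTIONAL is the documented default
        fixedValue: fixedValue
    });
    return this;
};

InitSketch.prototype.setText = function (field, obl, fixedValue) {
    return this.setField(field, "text", obl, fixedValue);
};

InitSketch.prototype.setInt = function (field, obl, fixedValue) {
    return this.setField(field, "int", obl, fixedValue);
};

InitSketch.prototype.exec = function (query) {
    var record = {};
    this.fields.forEach(function (f) {
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue; // constant value (sketch assumption: not overridable)
        } else if (query[f.name] !== undefined) {
            record[f.name] = query[f.name];
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null; // optional and not supplied
        }
    });
    return record;
};

var init = new InitSketch();
init.setText("KEYWORD", OBLIGATORY);
init.setInt("MAX_RESULTS");
init.setText("LANGUAGE", FIXED, "en");
console.log(init.exec({ KEYWORD: "denodo" }));
```

A query that omits KEYWORD fails in this sketch, while MAX_RESULTS may be left out and LANGUAGE always carries its fixed value.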

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5. Using threads in the Iterator component
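The hasNext()/next() protocol itself can be tried outside ITPilot with a minimal stand-in. ListIterator below is a hypothetical object backed by a JavaScript array; it only imitates the sequential behavior described above and does not reproduce the Thread-based parallel execution of Figure 5. The records are invented sample data.

```javascript
// Hypothetical stand-in for the Iterator component, backed by an array.
function ListIterator(list) {
    var position = 0;
    // True while at least one more record is available.
    this.hasNext = function () {
        return position < list.length;
    };
    // Returns the next record, in list order.
    this.next = function () {
        if (position >= list.length) {
            throw new Error("No more records");
        }
        return list[position++];
    };
}

var titles = [];
var iterator = new ListIterator([{ TITLE: "A" }, { TITLE: "B" }, { TITLE: "C" }]);
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);
}
console.log(titles.join(","));
```

Checking hasNext() before every next() keeps the loop from reading past the end of the list.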

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: sends a query to any source available through JDBC and returns a list of records with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: allows loops in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6. Using the Loop function
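The control flow of Figure 6 can be run outside ITPilot with a stand-in for the Condition object. ConditionSketch is hypothetical: it wraps a JavaScript predicate instead of an ITPilot condition expression and offers only the exec() call that the figure relies on. The page counter is invented for the example.

```javascript
// Hypothetical stand-in for Condition: evaluates a JavaScript predicate
// instead of an ITPilot condition expression.
function ConditionSketch(predicate) {
    this.exec = function (records) {
        return Boolean(predicate(records));
    };
}

var pagesVisited = 0;

// The condition is re-evaluated before every pass, as in a WHILE… DO loop.
var loop = new ConditionSketch(function () {
    return pagesVisited < 3;
});

while (loop.exec([])) {
    pagesVisited++; // the <loop operations> of Figure 6 would go here
}
console.log(pagesVisited);
```

Because the condition is tested first, the body does not run at all when the condition is false from the start.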

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: iterates over different inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session's information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): sets the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): sets the interval between two retries.
    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds a component input record that is to be used as a wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. "$0PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
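The way a field expression refers to an input record, and the effect of the two error actions, can be sketched in plain JavaScript. Everything here is illustrative: resolveReference and buildRecord are hypothetical helpers, the constants are string stand-ins, and the sketch assumes that a reference is written as index, dot, field name (for example $0.PARAM1), which is an assumption about the notation. The functions of Appendix A of [GENER] are not covered, and dropping a field under ON_ERROR_IGNORE is a choice of the sketch.

```javascript
// Stand-in constants; the real ITPilot values are not given here.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Resolves a reference such as "$0.PARAM1" against the input records.
// Only the simple "$<index>.<FIELD>" form is handled (sketch assumption).
function resolveReference(recordsObj, expression) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = recordsObj[Number(match[1])];
    if (record === undefined || !(match[2] in record)) {
        throw new Error("Cannot evaluate " + expression);
    }
    return record[match[2]];
}

// Hypothetical analogue of add() followed by exec(): builds one record
// from a list of field definitions.
function buildRecord(recordsObj, fieldDefinitions) {
    var output = {};
    fieldDefinitions.forEach(function (def) {
        try {
            output[def.fieldName] = resolveReference(recordsObj, def.expression);
        } catch (e) {
            if (def.errorAction !== ON_ERROR_IGNORE) {
                throw e; // ON_ERROR_RAISE: stop and report the error
            }
            // ON_ERROR_IGNORE: leave the field out and continue
        }
    });
    return output;
}

var inputs = [{ PARAM1: "laptop" }, { PRICE: 999 }];
var built = buildRecord(inputs, [
    { fieldName: "PRODUCT", expression: "$0.PARAM1", errorAction: ON_ERROR_RAISE },
    { fieldName: "COST", expression: "$1.PRICE", errorAction: ON_ERROR_RAISE },
    { fieldName: "STOCK", expression: "$1.STOCK", errorAction: ON_ERROR_IGNORE }
]);
console.log(built);
```

The STOCK field cannot be evaluated because the second input record has no such field; since its error action is ON_ERROR_IGNORE, the other two fields are still produced.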

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences that access other pages to be created from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7. Using the Repeat function
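The practical difference between a condition tested before the body and one tested after it, as in the do… while structure of Figure 7, is that the latter always runs its body at least once. The sketch below makes this visible in plain JavaScript. ConditionSketch, runWhile and runRepeat are hypothetical names; the stand-in evaluates a JavaScript predicate rather than an ITPilot condition expression.

```javascript
// Hypothetical stand-in for Condition: evaluates a JavaScript predicate.
function ConditionSketch(predicate) {
    this.exec = function (records) {
        return Boolean(predicate(records));
    };
}

// Condition tested before each pass.
function runWhile(condition, body) {
    while (condition.exec([])) {
        body();
    }
}

// Condition tested after each pass, as in the structure of Figure 7.
function runRepeat(condition, body) {
    do {
        body();
    } while (condition.exec([]));
}

// A condition that is false from the very beginning.
var neverTrue = new ConditionSketch(function () { return false; });

var whileRuns = 0;
var repeatRuns = 0;
runWhile(neverTrue, function () { whileRuns++; });
runRepeat(neverTrue, function () { repeatRuns++; });
console.log(whileRuns, repeatRuns);
```

The pre-tested loop never executes its body here, whereas the post-tested one executes it exactly once before the condition is first checked.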

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When the component is used from the wrapper generation graphical interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the component's browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as back sequence.
    • pages: number of back pages.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
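The functions listed for the Sequence object can be combined roughly as follows. This is only a sketch: so that it runs outside ITPilot, it declares a stand-in Sequence object and a stand-in SEQUENCE_IEBROWSER value, which are assumptions (inside a wrapper, ITPilot supplies the real ones). The NSEQL program, the URL and the use of a plain array for inputValues are also illustrative assumptions.

```javascript
// ---- Stand-ins (assumptions) so the sketch runs outside ITPilot. ----
// ITPilot supplies the real Sequence object and SEQUENCE_* constants;
// the values and bodies below are placeholders, not the real implementation.
var SEQUENCE_IEBROWSER = "SEQUENCE_IEBROWSER";

function Sequence(sequence, sequenceType, reusableConnection, inputPage) {
    this.sequence = sequence;
    this.sequenceType = sequenceType;
    this.reusableConnection = reusableConnection;
    this.retries = 0;
    this.retryDelay = 0;
    this.closed = false;
}
Sequence.prototype.setRetries = function (count) { this.retries = count; };
Sequence.prototype.setRetryDelay = function (mseconds) { this.retryDelay = mseconds; };
Sequence.prototype.exec = function (inputValues, inputPage) {
    // Placeholder "page"; the real component returns the last page reached.
    return { program: this.sequence, inputs: inputValues };
};
Sequence.prototype.close = function () { this.closed = true; };
Sequence.prototype.toString = function () { return this.sequence; };

// ---- Sketch of typical use ----
// Illustrative NSEQL program; see [NSEQL] for the actual syntax.
var nseql = "Navigate(http://www.example.com/search,0);";

var seq = new Sequence(nseql, SEQUENCE_IEBROWSER, false);
seq.setRetries(3);        // retry up to three times on failure
seq.setRetryDelay(2000);  // wait two seconds between retries

var page;
try {
    // inputValues is passed as a plain array here (an assumption).
    page = seq.exec(["denodo"]);
} finally {
    seq.close();          // always release the browser connection
}
```

Closing the sequence in a finally block is a design choice of the sketch, so that the browser is released even when exec fails.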

5.3.28 Store File

• Object: StoreFile

• Description: Stores the contents received as input parameter in a file.

• Functions:

  o Constructor(content, file)
    • content: string-type or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
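A minimal sketch of how the StoreFile functions fit together is shown below. The StoreFile declared here is a stand-in (an assumption) that only records its configuration so the sketch can run outside ITPilot; the real component is supplied by ITPilot and writes the file. The content and the directory path are hypothetical.

```javascript
// ---- Stand-in (assumption) so the sketch runs outside ITPilot. ----
// The real StoreFile is provided by ITPilot and actually writes the file.
function StoreFile(content, file) {
    this.content = content;
    this.file = file;
    this.generateFilename = false;
    this.retries = 0;
    this.retryDelay = 0;
    this.executed = false;
}
StoreFile.prototype.setGenerateFilename = function (generate) { this.generateFilename = generate; };
StoreFile.prototype.setRetries = function (count) { this.retries = count; };
StoreFile.prototype.setRetryDelay = function (mseconds) { this.retryDelay = mseconds; };
StoreFile.prototype.exec = function () { this.executed = true; };

// ---- Sketch of typical use ----
var report = "<html><body>Quarterly report</body></html>"; // hypothetical content
var store = new StoreFile(report, "/tmp/itpilot-output");  // hypothetical directory

store.setGenerateFilename(true); // target is a directory, so let the name be generated
store.setRetries(2);
store.setRetryDelay(1000);       // one second between retries
store.exec();
```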

5.3.29 Thread

• Object: Thread

• Description: Represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is to be carried out concurrently.

• Functions:

  o wait(): puts the thread on standby until all the executions invoked with the execute function have finished.

  o execute(functionName, <list of arguments>): launches a thread of execution on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number of concurrent threads.
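The execute/wait pattern described above can be sketched as follows. Several points are assumptions: this section does not document a constructor for Thread, so new Thread() is a guess; functionName is passed as a string; and the Thread declared here is a stand-in that simply runs the queued calls one after another, so that the sketch runs outside ITPilot. The processRecord function and its arguments are hypothetical.

```javascript
// ---- Stand-in (assumption) so the sketch runs outside ITPilot. ----
// The real Thread runs the calls concurrently; this placeholder queues
// them and runs them sequentially when wait() is called.
function Thread() {
    this.max = 1;
    this.pending = [];
}
Thread.prototype.setMaxConcurrentThreads = function (n) { this.max = n; };
Thread.prototype.execute = function (functionName) {
    var args = Array.prototype.slice.call(arguments, 1);
    this.pending.push({ name: functionName, args: args });
};
Thread.prototype.wait = function () {
    while (this.pending.length > 0) {
        var call = this.pending.shift();
        // Stand-in lookup of the function by its name.
        eval(call.name).apply(null, call.args);
    }
};

// ---- Sketch of typical use ----
var results = [];

// Hypothetical per-record processing function.
function processRecord(title, price) {
    results.push(title + ":" + price);
}

var worker = new Thread();
worker.setMaxConcurrentThreads(2);              // at most two in parallel
worker.execute("processRecord", "Book A", 10);  // arguments match processRecord
worker.execute("processRecord", "Book B", 12);
worker.wait();                                  // block until both calls finish
```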

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
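As an illustration of the functions above, the following sketch shows what a hypothetical component file uppercase.js could contain. The component name, the field name and the logic are invented for the example. The STRING_TYPE value is taken from the list of output types, and the Record_Structure declared here is a stand-in (an assumption) so that the sketch runs outside ITPilot. How the input value is unpacked inside the main function is not described in this section, so the sketch treats it as a plain string.

```javascript
// ---- Stand-ins (assumptions) so the sketch runs outside ITPilot. ----
var STRING_TYPE = 13; // value from the list of output types
function Record_Structure(name) {
    this.name = name;
    this.fields = [];
}
Record_Structure.prototype.setText = function (field, regexp, type) {
    this.fields.push(field);
};

// ---- Contents of the hypothetical file uppercase.js ----

// Input schema: a single text field.
function uppercase_getInputStructure() {
    var input = new Record_Structure("UPPERCASE_INPUT");
    input.setText("VALUE");
    return input;
}

// Output type: a simple string, so no uppercase_getOutputStructure()
// is needed (it is only required for RECORD_TYPE or LIST_TYPE).
function uppercase_getOutputType() {
    return STRING_TYPE;
}

// Main function: returns the input converted to upper case.
function uppercase_main(uppercase_input) {
    var uppercase_output = null;
    if (uppercase_input !== null && uppercase_input !== undefined) {
        uppercase_output = String(uppercase_input).toUpperCase();
    }
    return uppercase_output;
}
```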

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

4 CREATING CUSTOM ITPILOT FUNCTIONS

Custom functions let users extend the set of functions available in ITPilot. Custom functions are Java classes, packaged in a Jar file, that are added to ITPilot so that they can be used in the same way as other functions such as MAX, MIN, SUM, etc.

Denodo4E, an Eclipse plug-in that provides tools for creating, debugging and deploying Denodo extensions, including custom ITPilot functions, is included in the Denodo Platform. Please read the README in $DENODO_HOME/tools/denodo4e for more information.

Each function must be in a different Java class, but several functions can be grouped in a single Jar. Custom functions can be created using naming conventions, without dependencies on Denodo libraries, but we recommend developing them with Java annotations. The annotations, and the compound types and values required to create custom functions, are located in:

$DENODO_HOME/lib/contrib/denodo-custom.jar

These are the rules that every custom function must follow to work properly:

• Functions with the same name are not allowed. If a jar contains one or more functions with name conflicts, nothing in that jar will be loaded in the server.
• All custom functions stored in the same jar are added or removed together, by uploading or removing the jar in the server.
• Each function can have many signatures. Each signature is defined by an execution method in the Java class defining the custom function.
• Functions can have arity n, but only the last parameter of the signature can be repeated n times.

A custom function is defined in a Java class containing all its implementation; the name of the function is extracted from that Java class. A function can have several signatures, that is, different combinations of arguments (different number, types or both). For each signature of the function, this class must define a Java method implementing the functionality of the function with those arguments, plus one additional method if the signature returns a different type depending on the parameters or the return type is compound (array or register).

When defining custom functions, simple types are mapped directly from Java objects to ITPilot data types. The following table shows how the mapping works and which Java types can be used:

    Java                  ITPilot
    java.lang.Integer     int
    java.lang.Long        long
    java.lang.Float       float
    java.lang.Double      double
    java.lang.Boolean     boolean
    java.lang.String      text
    java.util.Calendar    date
    byte[]                binary

Equivalency between Java and ITPilot data types

Note: The parameters of a custom function cannot be primitive Java types (int, long, double, etc.).

4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive. For a Java class to be recognized as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + “ItpFunction”

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array, e.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. Defining it is optional: if it is not present, the execute method is called and the return type is obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid these limitations, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: Class annotation used to define the class as a custom function. The annotation requires the parameters:
  • name: name of the custom function.
  • type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor: Method annotation used to mark the method as a function signature. This method is executed when the function is used with the appropriate arguments. The annotation has an optional attribute, syntax, that specifies the syntax of the function signature presented to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType: Method annotation used to mark the method as the one that computes the return type of a function signature before a query is executed.

• com.denodo.common.custom.annotations.CustomParam: Parameter annotation with the parameter name, used to make the automatically generated syntax description of the signature more user friendly. If this annotation is not used, the syntax uses the names arg1, arg2, etc. to represent the input parameters.

4.2 COMPOUND TYPES

Compound types and values in the custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: Class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either the java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue: Class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType: Class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue: Class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil: Helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object among their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on the input values, or functions returning compound types, can implement an additional method that computes the return type without executing the function. This is entirely optional, but it provides better performance when the execution of the function is slower or more memory intensive than the return type calculation. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained directly from the method).
2. The execution method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same type as, or a type equivalent to, its respective parameter in the execute method.

The return type of the additional method depends on what the execute method returns:

• If the execute method returns a basic Java type, the additional method has to return the same basic Java class. For example, if the execute method returns a String object, the additional method has to return java.lang.String.class.
• If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
• If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table ‘Equivalency between Java and ITPilot data types’ at the beginning of section 4 for the type that these return values will have in ITPilot.

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type = CustomElementType.ITPFUNCTION, name = "SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        // Function signature: SPLIT_SAMPLE(regexp, value)
        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name = "regexp") String regex,
                @CustomParam(name = "value") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
            // One record with a single "string" field per substring
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        // Computes the return type (array of records) without running the function
        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2 ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262].

The following sections assume some basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components in section 5.3, and how to develop complete JavaScript wrappers by following the indications shown in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3 ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist. [1]

These functions mirror the definition of the process as components: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

[1] Since version 4.0 SP1 this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is kept, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

This function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on in section 5.3 (the Component Catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure

This function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections define them.

5.3.2.1 Record Structure

• Object: Record_Structure

• Description: Represents a data structure that allows the structure of a specific record to be defined. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).

• Functions:

  o Constructor(name)
    • name: name of the structure.

  o setText(field, regexp, type): creates a new character string field in the record.

    • field: name of the new field.
    • regexp: (optional) regular expression that the character string must match. By default, if no constraint exists, its value is “”.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

  o setBoolean(field, type): creates a new Boolean-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp: (optional) regular expression that the character string must match. By default, if no constraint exists, its value is “”.
    • format: (optional) date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the structure of the records contained in the array.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

  o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.

NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the case of Text, Integer, Float and Link-type fields, for which specific functions apply.
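The following sketch shows how the Record_Structure setters might be used in a getOutputSchema() function, including the rule that an optional parameter can only be omitted when no argument to its right is given. The Record_Structure, OPTIONAL and OBLIGATORY declared here are stand-ins (assumptions) so that the sketch runs outside ITPilot, and addField is a helper of the stand-in, not part of the documented API. The record and field names are hypothetical.

```javascript
// ---- Stand-ins (assumptions) so the sketch runs outside ITPilot. ----
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

function Record_Structure(name) {
    this.name = name;
    this.fields = [];
}
// Helper of the stand-in only; applies the documented default (optional).
Record_Structure.prototype.addField = function (kind, field, type) {
    this.fields.push({
        kind: kind,
        field: field,
        type: type === undefined ? OPTIONAL : type
    });
};
Record_Structure.prototype.setText = function (field, regexp, type) {
    this.addField("text", field, type);
};
Record_Structure.prototype.setInt = function (field, type) {
    this.addField("int", field, type);
};
Record_Structure.prototype.setDouble = function (field, type) {
    this.addField("double", field, type);
};

// ---- Sketch of an output schema ----
function getOutputSchema() {
    var out = new Record_Structure("BOOK");
    // regexp must be given ("" = no constraint) because type follows it.
    out.setText("TITLE", "", OBLIGATORY);
    // regexp and type omitted: no constraint, optional field.
    out.setText("AUTHOR");
    out.setInt("PAGES");
    // setDouble has no regexp parameter, so type is the second argument.
    out.setDouble("PRICE", OBLIGATORY);
    return out;
}

var schema = getOutputSchema();
```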

5.3.2.2 Record List

• Object: List

• Functions:

  o setListName(listName): sets the name of the list.
    • listName: name of the list.

  o add(obj): adds an element to the list.
    • obj: element to add.

  o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all, or almost all, components and are therefore described in this first section. The catalog indicates which components do not provide some of the “common” functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave when an error occurs. The onError function can be invoked several times with different errorId values.

  o errorId: type of error whose handling is being configured. The possible values are:

    • RUNTIME_ERROR: error while the component is being run.

    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.

    • HTTP_ERROR: error produced by an HTTP error.

    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.

    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).

  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:

    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.

    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return “null” if there is an error, except FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5) also return “null”, but it is evaluated as “false” if they are used in a condition expression.

    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured for each component.

    • ON_ERROR_RETRY_IGNORE: reruns the wrapper, as with ON_ERROR_RETRY, but continues with the wrapper execution if the error persists after the retries.
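The four error actions can be modelled in plain JavaScript as follows. This is only an illustrative sketch, not the ITPilot implementation: the function runWithErrorAction, its parameters and the string values of the constants are hypothetical, the delay between retries is omitted, and only the failing step is retried rather than the whole wrapper.

```javascript
// Hypothetical stand-ins for the ITPilot error-action constants.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Runs `step` under the given error action (illustrative model only).
// `retries` is the number of extra attempts made after the first failure.
function runWithErrorAction(step, errorAction, retries) {
    var retrying = errorAction === ON_ERROR_RETRY ||
                   errorAction === ON_ERROR_RETRY_IGNORE;
    var attempts = retrying ? retries + 1 : 1;
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return step();
        } catch (e) {
            lastError = e;
        }
    }
    if (errorAction === ON_ERROR_IGNORE ||
        errorAction === ON_ERROR_RETRY_IGNORE) {
        return null; // the error is swallowed and the run continues
    }
    throw lastError; // RAISE, or RETRY with every attempt exhausted
}

// Example: a step that fails twice and then succeeds.
var calls = 0;
function flakyStep() {
    calls++;
    if (calls < 3) {
        throw new Error("TIMEOUT_ERROR");
    }
    return "page";
}

var retried = runWithErrorAction(flakyStep, ON_ERROR_RETRY, 3);
var ignored = runWithErrorAction(function () {
    throw new Error("HTTP_ERROR");
}, ON_ERROR_IGNORE, 0);
```

In this model the ignored step yields null, which mirrors the “null” return value described for ON_ERROR_IGNORE.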

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace file and 5 means that all message types are written to it. The log types are the following:

  o TRACE

  o DEBUG

  o INFO

  o WARN

  o ERROR

  o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List

• Description: adds a record to a list.

• Functions:

  o Constructor()

  o exec(record, list): executes the function.

    • record: record to be added to the list.

    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition

• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.

• Functions:

  o Constructor(expr)

    • expr: this parameter defines the condition expression. It is expressed as a string of characters (e.g., MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(elements): main function of the Condition component. It evaluates the condition and returns “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.

    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
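The positional references $0 and $1 can be illustrated with a small stand-in written in plain JavaScript. SimpleCondition is a hypothetical object, not the ITPilot Condition component: it understands only a single comparison between two positional references, whereas real condition expressions can use the functions listed in Appendix A of [GENER].

```javascript
// Hypothetical stand-in for the Condition component. It accepts only
// expressions of the form "($i <op> $j)", with optional parentheses.
function SimpleCondition(expr) {
    var match = /^\(?\s*\$(\d+)\s*(<=|>=|==|<|>)\s*\$(\d+)\s*\)?$/.exec(expr);
    if (match === null) {
        throw new Error("Unsupported expression: " + expr);
    }
    this.left = parseInt(match[1], 10);   // index of the first operand
    this.operator = match[2];
    this.right = parseInt(match[3], 10);  // index of the second operand
}

// Applies the comparison to the elements at the referenced positions.
SimpleCondition.prototype.exec = function (elements) {
    var a = elements[this.left];
    var b = elements[this.right];
    switch (this.operator) {
        case "<=": return a <= b;
        case ">=": return a >= b;
        case "<": return a < b;
        case ">": return a > b;
        default: return a == b;
    }
};

var myCondition = new SimpleCondition("($0 <= $1)");
var met = myCondition.exec([3, 7]);     // compares 3 with 7
var notMet = myCondition.exec([9, 7]);  // compares 9 with 7
```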

5.3.6 Create List

• Object: Create_List

• Description: creates an empty list.

• Functions:

  o Constructor(listname): creates an empty list.

    • listname: name of the list of records to be created.

  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser

• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.

• Functions:

  o Constructor(): creates a persistent browser and returns its handler.

  o exec(): executes the component.

5.3.8 Diff

• Object: Diff

• Description: the Diff component compares two pages and returns the differences between them with regard to the retrieved HTML code.

• Functions:

  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)

    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).

    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).

    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).

    • tokenSeparator: character string used as the HTML page element separator when the result page is generated, so that each element can be adequately identified.

  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.

    • baseCode: character string with the source page content.

    • finalCode: character string or page object with the target page content.

  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages and points out the differences between them.

    • baseCode: character string with the source page content.

    • finalCode: character string or page object with the target page content.

  o setAdditionPrefixLabel(additionPrefixLabel): modifies the start tag for added data.

    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).

  o setAdditionSuffixLabel(additionSuffixLabel): modifies the end tag for added data.

    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).

  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the start tag for deleted data.

    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the end tag for deleted data.

    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).

  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.

    • nullWhenEquals: “true” means that “null” is returned when both pages are equal; “false” means that the result page is returned.

  o setIgnoreTagAttributes(simplifyTags): determines whether the HTML tag attributes are taken into account when comparing both pages.

    • simplifyTags: “true” means that the HTML tag attributes are ignored. With “false” they are not ignored.

  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.

    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.

  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page.

    • mergedDeletions: with “true”, the deleted content is shown. If the value is “false”, the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.

  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.

    • replacement: Perl [PERL] regular expression.

  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page. Tokens that match the regular expression are discarded before the comparison starts.

    • regexp: Perl [PERL] regular expression.
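The role of the prefix and suffix labels and of the token separator can be illustrated with a small token-level comparison. This is a sketch under simplifying assumptions, not the algorithm used by the Diff component: markDifferences and the labels object are hypothetical, tokens are obtained by splitting on whitespace instead of parsing HTML, and the differences are found with a longest-common-subsequence table.

```javascript
// Illustrative token-level diff (not the ITPilot implementation).
function markDifferences(baseCode, finalCode, labels) {
    var a = baseCode.split(/\s+/);
    var b = finalCode.split(/\s+/);
    var i, j;
    // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
    var lcs = [];
    for (i = 0; i <= a.length; i++) {
        lcs.push([]);
        for (j = 0; j <= b.length; j++) {
            lcs[i].push(0);
        }
    }
    for (i = a.length - 1; i >= 0; i--) {
        for (j = b.length - 1; j >= 0; j--) {
            lcs[i][j] = a[i] === b[j]
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }
    // Walk both token lists, wrapping added and deleted tokens in labels.
    var out = [];
    i = 0;
    j = 0;
    while (i < a.length || j < b.length) {
        if (i < a.length && j < b.length && a[i] === b[j]) {
            out.push(a[i]);
            i++;
            j++;
        } else if (j < b.length &&
                   (i === a.length || lcs[i][j + 1] >= lcs[i + 1][j])) {
            out.push(labels.additionPrefix + b[j] + labels.additionSuffix);
            j++;
        } else {
            out.push(labels.deletionPrefix + a[i] + labels.deletionSuffix);
            i++;
        }
    }
    return out.join(labels.tokenSeparator);
}

// Plain-text labels are used here; the component defaults are HTML tags.
var labels = {
    additionPrefix: "[+", additionSuffix: "+]",
    deletionPrefix: "[-", deletionSuffix: "-]",
    tokenSeparator: " "
};
var result = markDifferences("<b> old text </b>", "<b> new text </b>", labels);
```

With these inputs, the token "new" is wrapped in the addition labels and the token "old" in the deletion labels, while the unchanged tokens are emitted as they are.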

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.

• Functions:

  o Constructor(expression)

    • expression: object that defines the condition expression. This object is expressed as a string of characters (e.g., MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.

    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor

• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).

• Functions:

  o Constructor(name, page, specification, structure)

    • name: name of the Extractor component instance.

    • page: page-type ITPilot structure from which data is to be extracted.

    • specification: DEXTL data extraction specification (see [DEXTL]).

    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.

  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the constructor in the structure parameter.

  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).

    • merge: Boolean parameter; “true” if the pattern merge technique is to be applied, or “false” if not. This is “true” by default.

  o setI18n(i18n): function that updates the process internationalization.

    • i18n: type of internationalization to use. ITPilot provides different types of internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: this obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

  o Constructor(url, sequenceType, reusableConnection, binary, page)

    • url: URL where the resource to be downloaded can be found (OPTIONAL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.

    • page: optionally, the page from which the HTTP request is launched can be indicated.

  o exec(page): runs the component, returning the string- or binary-type value obtained.

    • page: optionally, the page from which the HTTP request is launched can be indicated.

  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.

    • encoding: MIME type of the information to send.

  o syncWithPost(flag): lets the user set the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.

    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if it has not, ITPilot executes a Back() NSEQL command.

  o setBackSequence(back): optionally sets an explicit navigation sequence back to the source page, against which further information extraction operations are going to be executed.

    • back: back sequence NSEQL program.

  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

    • reusingConnection: if the value is set to “true”, the connection coming from previous components is reused; if it is set to “false”, a new browser is launched, importing the information from the previous session.

  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when no back sequence has been defined and POST synchronization has not been configured.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation.

    • 1: Internet Explorer browser implementation.

    • 2: Firefox browser implementation.

    • 3: Denodo HTTP browser implementation.
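The way syncWithPost, setBackSequence and setBackPages interact can be summarized as a decision order. The following sketch models only that order; chooseBackStrategy, the settings object and the returned method names are hypothetical, and the default of one page for the Back() command is an assumption made for the example.

```javascript
// Hypothetical model (not the ITPilot implementation) of how the source
// page is recovered before further extraction operations are run.
function chooseBackStrategy(settings) {
    if (settings.syncWithPost) {
        // Re-send the POST request that originally produced the page.
        return { method: "POST" };
    }
    if (settings.backSequence) {
        // Run the explicit NSEQL back sequence set with setBackSequence.
        return { method: "BACK_SEQUENCE", sequence: settings.backSequence };
    }
    // Fall back to the NSEQL Back() command, going back `backPages` pages.
    return { method: "BACK_COMMAND", pages: settings.backPages || 1 };
}

var byPost = chooseBackStrategy({ syncWithPost: true });
var bySequence = chooseBackStrategy({
    syncWithPost: false,
    backSequence: "<NSEQL back sequence>"
});
var byBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```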

5.3.13 Filter

• Object: Filter

• Description: this carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

  o Constructor(expr, auxiliaryRecords)

    • expr: regular expression of the filtering operation for a list of records, which are described in the exec function.

    • auxiliaryRecords: record list that participates in the filter condition, but whose records are not the records to filter.

  o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subgroup complying with the selection expression indicated in the constructor.

    • inputRecords: list of input records.

    • auxiliaryRecords: record list that participates in the filter condition, but whose records are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
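The behavior described in the note can be sketched in plain JavaScript. The function filterRecords, its ignoreErrors flag and the isCheap condition are hypothetical names used only for illustration; the real component receives its condition as an expression in the constructor.

```javascript
// Illustrative model of FILTER under ON_ERROR_IGNORE (not ITPilot code).
function filterRecords(inputRecords, predicate, ignoreErrors) {
    var output = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) {
                output.push(inputRecords[i]);
            }
        } catch (e) {
            if (!ignoreErrors) {
                throw e; // ON_ERROR_RAISE-like behavior
            }
            // ON_ERROR_IGNORE-like behavior: skip only the failing record
        }
    }
    return output;
}

// Hypothetical condition: keep the records cheaper than 20.
function isCheap(record) {
    if (typeof record.price !== "number") {
        throw new Error("price is not numeric");
    }
    return record.price < 20;
}

var records = [{ price: 10 }, { price: "n/a" }, { price: 35 }, { price: 15 }];
var cheap = filterRecords(records, isCheap, true);
```

Here the record whose price cannot be evaluated is dropped, and the remaining records are still filtered normally.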

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: this allows a run loop to be generated for a specific form, where predetermined values for each of the fields included are used in each run.

• Functions:

  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.

    • inputPage: input page from which the selected form can be iteratively invoked.

    • parallelIterator: with “true”, the component executes its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. There are JavaScript constants defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field for the chosen form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • valuesArray: list of values that must be selected in the field.

    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).

    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. There are JavaScript constants defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): indicates the values selected from a selection field for the chosen form.

    • field: name of the HTML selection field.

    • position: position occupied in the event of more than one field element with the same name.

    • positions: values of the elements on which the component must iterate.

  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.

    • field: name of the HTML text field.

    • position: position of the field in the event of several on the form with the same name.

    • values: list of values that must be selected in the field.

    • positions: list that indicates the position held for each value element in the event of replicated values.

    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function (“true”) or may simply contain them (“false”).

  o click(field, value, state): function that allows an element to be selected and a “click” event to be run on it.

    • field: name of the HTML field on which the click is to be made.

    • value: when this function is run on radio buttons, this parameter indicates the elements selected, as a list (e.g., [0, 1]). When run on checkboxes, it indicates the value of the selectable element.

    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o input(field, position, values): function that indicates the values added to an input field.

    • field: name of the HTML input field.

    • position: position of the field in the event of several on the form with the same name.

    • values: list of values that must be entered in the field.

  o textarea(field, position, values): indicates the values added to a text area.

    • field: name of the HTML text area field.

    • position: position of the field in the event of several on the form with the same name.

    • values: list of values that must be entered in the field.

  o toList(): returns the list with the NSEQL sequences used in each iteration.

  o setMaxIterations(count): sets the maximum number of iterations that can be executed.

    • count: number that determines the maximum number of iterations.

  o setRetries(count): updates the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o setParallelIterator(flag): determines whether the component launches its iterations in parallel.

    • flag: with “true”, the iterations are executed in parallel.

  o next(inputPage): returns the page resulting from running a component iteration.

    • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.

  o hasNext(): function that determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.

  o close(): function that closes the iterator.

  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.

    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If it has not, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation.

    • 1: Internet Explorer browser implementation.

    • 2: Firefox browser implementation.

    • 3: Denodo HTTP browser implementation.
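The iteration over form values can be illustrated with a small model. It assumes that the iterations are the cross product of the values configured for each field, which the guide does not state explicitly. FormCombinations is a hypothetical stand-in that mirrors only the input, click, setMaxIterations, hasNext and next signatures; it does not run NSEQL sequences, and its next() returns the field values of the iteration rather than a page.

```javascript
// Hypothetical stand-ins for the ITPilot constants.
var CLICKED_ELEMENT = "CLICKED_ELEMENT";
var NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
var CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Simplified model of Form_Iterator (not the ITPilot implementation).
function FormCombinations() {
    this.fields = [];              // [{ name: ..., values: [...] }]
    this.maxIterations = Infinity;
    this.current = 0;              // index of the next combination
}

// Mirrors input(field, position, values); `position` is not used here.
FormCombinations.prototype.input = function (field, position, values) {
    this.fields.push({ name: field, values: values });
};

// Mirrors click(field, value, state) for a checkbox; `value` is not used.
FormCombinations.prototype.click = function (field, value, state) {
    var values = state === CLICKED_AND_NON_CLICKED_ELEMENT
        ? [true, false]
        : [state === CLICKED_ELEMENT];
    this.fields.push({ name: field, values: values });
};

FormCombinations.prototype.setMaxIterations = function (count) {
    this.maxIterations = count;
};

// Number of iterations: product of the value counts, capped by the maximum.
FormCombinations.prototype.total = function () {
    var total = 1;
    for (var i = 0; i < this.fields.length; i++) {
        total *= this.fields[i].values.length;
    }
    return Math.min(total, this.maxIterations);
};

FormCombinations.prototype.hasNext = function () {
    return this.current < this.total();
};

// Decodes the running index as a mixed-radix number, one digit per field.
FormCombinations.prototype.next = function () {
    var index = this.current++;
    var combination = {};
    for (var i = this.fields.length - 1; i >= 0; i--) {
        var values = this.fields[i].values;
        combination[this.fields[i].name] = values[index % values.length];
        index = Math.floor(index / values.length);
    }
    return combination;
};

var form = new FormCombinations();
form.input("city", 0, ["Madrid", "Paris"]);
form.click("directOnly", null, CLICKED_AND_NON_CLICKED_ELEMENT);
var combinations = [];
while (form.hasNext()) {
    combinations.push(form.next());
}
```

Two input values combined with a checkbox in the CLICKED_AND_NON_CLICKED_ELEMENT state give four iterations in this model; calling setMaxIterations with a smaller number would cut the loop short.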

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.

• Functions:

  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identification.

    • browserUuid: browser id.

  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27).

    • pageType: type of browser used to access the page:

      • SEQUENCE_IEBROWSER = 1

      • SEQUENCE_HTTP_BROWSER = 2

    • lastURL: last URL the page is coming from.

    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.

    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.

    • cookie: information storage “cookies”.

    • proxyUser: user name to access the proxy, if required.

    • proxyPassword: user password to access the proxy, if required.

    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: is responsible for storing the structure of the input data, which is the data that the wrapper will receive from the calling application.

• Functions:

  o Constructor(input, output)

    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context.

    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).

  o get(name): returns the value of a record field created as part of the initialization parameters.

    • name: name of the record field.

  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

    • field: name of the field to create.

    • format: representation format of the date field. This format is optional, but if it is specified it becomes compulsory; otherwise the wrapper may not run. This representation format is defined in [DATEFORMAT].

    • obl: indicates whether the field to create is compulsory in queries. The possible values are:

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setName(name) update function for the component name

bull name new component name

o setI18n(i18n) function which updates the process i18n

bull i18n type of internationalization to be used ITPilot provides different types of i18n configurations such as ES_EURO US_PST GB etc See [GENER] for more information about internationalization in ITPilot

o exec() main function for running the component returning a record representing the wrapper initialization parameters
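The three obl values can be illustrated with a small stand-alone sketch. Nothing in it is ITPilot code: the constants are local stand-ins, resolveQueryValues is a hypothetical helper, and two behaviours are assumptions made for illustration only (a FIXED field always takes its fixedValue, and a missing OBLIGATORY value raises an error).

```javascript
// Local stand-ins for the ITPilot constants (hypothetical values, for this sketch only).
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

// Hypothetical helper: resolves the value each declared field takes for one query.
function resolveQueryValues(fields, queryValues) {
    var resolved = {};
    for (var i = 0; i < fields.length; i++) {
        var field = fields[i];
        var supplied = Object.prototype.hasOwnProperty.call(queryValues, field.name);
        if (field.obl === FIXED) {
            resolved[field.name] = field.fixedValue;      // assumed: constant value always wins
        } else if (supplied) {
            resolved[field.name] = queryValues[field.name];
        } else if (field.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + field.name);
        }
        // OPTIONAL and not supplied: the field is simply left out.
    }
    return resolved;
}

var fields = [
    { name: "KEYWORD", obl: OBLIGATORY },
    { name: "START_DATE", obl: OPTIONAL },
    { name: "SITE", obl: FIXED, fixedValue: "http://www.example.com" }
];
var values = resolveQueryValues(fields, { KEYWORD: "denodo" });
```

Here values holds KEYWORD and SITE only; START_DATE is optional and was not supplied.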

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates on a list of records, one by one.
• Functions:

  o Constructor(list)
    • list: list of records on which to iterate.

  o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.

  o next(): returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
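For comparison, the plain (non-parallel) hasNext()/next() loop can be sketched outside ITPilot. IteratorStub is a hypothetical stand-in written only so the example runs on its own; the record field TITLE and the function collectTitles are likewise invented for illustration.

```javascript
// Hypothetical stand-in for the ITPilot Iterator component.
function IteratorStub(list) {
    var position = 0;
    // true while at least one more record remains
    this.hasNext = function () { return position < list.length; };
    // returns the next record, in list order
    this.next = function () { return list[position++]; };
}

// Collects the TITLE field of every record with the hasNext()/next() loop.
function collectTitles(records) {
    var iterator = new IteratorStub(records);
    var titles = [];
    while (iterator.hasNext()) {
        var recordInstance = iterator.next();
        titles.push(recordInstance.TITLE);
    }
    return titles;
}

var titles = collectTitles([{ TITLE: "first" }, { TITLE: "second" }]);
```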

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: these functions allow sending a query to any source available via JDBC and return a record list with the obtained results.
• Functions:

  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: component unique identifier.
    • uri: connection URL to the database.
    • driver: driver class to use to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.

  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.

  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.

  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6 Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: allows iterating over different inter-related pages, by means of one or several browsing sequences.
• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session's information.
    • inputPage: indicates the page from which the next browsing sequence is to be made.

  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.

  o close(): closes the iterator.

  o setRetries(count): configures the number of retries in the event of an error in accessing the next page.
    • count: number of retries.

  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.

  o setBackSequence(back): optionally allows an explicit browsing sequence back to the source page to be indicated, so that more data extraction operations can be carried out.
    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components will be reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
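The order in which the three back-navigation mechanisms are considered (POST synchronization, explicit back sequence, NSEQL Back()) can be summarized in a short sketch. The function chooseBackStrategy and the shape of its settings object are hypothetical and exist only to restate the rules described for syncWithPost, setBackSequence and setBackPages; they are not part of ITPilot.

```javascript
// Hypothetical helper: returns which mechanism would be used to get back to
// the source page, following the precedence described in the text.
function chooseBackStrategy(settings) {
    if (settings.syncWithPost) {
        // POST the original parameters again to the page URL.
        return { strategy: "POST" };
    }
    if (settings.backSequence) {
        // An explicit NSEQL back program was configured.
        return { strategy: "BACK_SEQUENCE", sequence: settings.backSequence };
    }
    // Neither is configured: NSEQL Back() is run for the configured number of pages.
    return { strategy: "NSEQL_BACK", pages: settings.backPages };
}

var byPost = chooseBackStrategy({ syncWithPost: true, backSequence: "myBackProgram", backPages: 2 });
var bySequence = chooseBackStrategy({ syncWithPost: false, backSequence: "myBackProgram", backPages: 2 });
var byBack = chooseBackStrategy({ syncWithPost: false, backSequence: null, backPages: 2 });
```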

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:

  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.

  o add(record): subsequently adds the component input record to be used as the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: allows a record to be constructed using other records generated in the flow, as well as generating new attributes derived from already existing ones.
• Functions:

  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. "$0PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements, except for the one that caused the error.

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.
• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; allows a home page to be indicated.

  o exec(): returns a page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:

  o Constructor(page)
    • page: page loaded on the browser that is going to be released.

  o Constructor(browserUuid)
    • browserUuid: browser identifier.

  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
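The practical difference between the while form used by Loop (section 5.3.19) and the do… while form used by Repeat is when the condition is first evaluated. The sketch below is only an illustration in plain JavaScript: ConditionStub is a hypothetical stand-in that evaluates an ordinary JavaScript predicate instead of an ITPilot condition expression, and runLoop and runRepeat are invented names.

```javascript
// Hypothetical stand-in for the ITPilot Condition component.
function ConditionStub(predicate) {
    this.exec = function (inputs) { return predicate(inputs) === true; };
}

// Loop style: the condition is evaluated before every pass.
function runLoop(condition) {
    var passes = 0;
    while (condition.exec([])) {
        passes++;
    }
    return passes;
}

// Repeat style: the body runs once before the condition is first evaluated.
function runRepeat(condition) {
    var passes = 0;
    do {
        passes++;
    } while (condition.exec([]));
    return passes;
}

var alwaysFalse = new ConditionStub(function () { return false; });
var loopPasses = runLoop(alwaysFalse);     // the body never runs
var repeatPasses = runRepeat(alwaysFalse); // the body runs exactly once
```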

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from its position within the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in NSEQL language (see [NSEQL]).
• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.

  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.

  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

  o close(): closes the connection with the running browser.

  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.

  o setBackSequence(back): optionally allows an explicit browsing sequence back to the source page to be indicated, so that more data extraction operations can be carried out.
    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components will be reused. With the parameter set to "false", a new browser is opened and the data is imported from the previous session.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as back sequence.
    • pages: number of back pages.

  o toString(): returns the NSEQL (see [NSEQL]) sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
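What setRetries and setRetryDelay configure can be pictured with a generic retry loop. This is a sketch under assumptions, not ITPilot's implementation: runWithRetries is a hypothetical helper, "retries" is assumed to mean additional attempts after the first failure, and the pause between attempts is passed in as a function so the example stays synchronous.

```javascript
// Hypothetical helper: runs action(), retrying up to `retries` more times on
// error and calling pause(delayMs) before each new attempt.
function runWithRetries(action, retries, delayMs, pause) {
    var attempt = 0;
    while (true) {
        try {
            return action();
        } catch (error) {
            if (attempt >= retries) {
                throw error; // retries exhausted: propagate the last error
            }
            attempt++;
            pause(delayMs);
        }
    }
}

var calls = 0;
var pauses = [];
// The simulated navigation fails twice and succeeds on the third attempt.
var page = runWithRetries(function () {
    calls++;
    if (calls < 3) { throw new Error("navigation failed"); }
    return "page after " + calls + " attempts";
}, 2, 500, function (ms) { pauses.push(ms); });
```

With two retries configured, the action is attempted at most three times, and the recorded pauses show one 500 ms wait before each retry.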

5.3.28 Store File

• Object: StoreFile
• Description: stores the contents entered as the input parameter in a file.
• Functions:

  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content will be stored.
    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be automatically generated when the input file is null or is a directory.
    • generate: indicates whether the file name should be automatically generated.

  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:

  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.

  o execute(functionName, <list of arguments>): launches the execution thread on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.

• mycustom_getInputStructure() { … }
  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
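As an illustration of the naming scheme, the sketch below outlines part of a component file for a hypothetical component called "shout" whose output is a plain string. Several things are assumptions: the component name, the idea that the input arrives as a single string value, and the local definition of STRING_TYPE (copied from the list of type codes and presumably predefined inside ITPilot). The shout_getInputStructure() function is left out because this section does not show what it should return, and no shout_getOutputStructure() is needed because the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Type code taken from the list of output types; defined locally only so the
// sketch runs outside ITPilot.
var STRING_TYPE = 13;

// Main function of the hypothetical "shout" component.
function shout_main(shout_input) {
    var shout_output = null;
    if (shout_input !== null && shout_input !== undefined) {
        shout_output = String(shout_input).toUpperCase();
    }
    return shout_output;
}

// Declares the output type of the component.
function shout_getOutputType() {
    return STRING_TYPE;
}

var shouted = shout_main("itpilot");
```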

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it should be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file, the name of which matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section.

To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters the custom component receives as input.
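The try/finally structure of Figure 8 guarantees that the scope is closed even when the component fails. The following sketch demonstrates this with stand-ins only: SCOPE_STUB merely records calls, runComponent is a hypothetical wrapper around the pattern, and the failing component is invented for the example.

```javascript
// Hypothetical stand-in for the ITPilot SCOPE object; it only records calls.
var scopeCalls = [];
var SCOPE_STUB = {
    create: function () { scopeCalls.push("create"); },
    close: function () { scopeCalls.push("close"); }
};

// Same shape as Figure 8: open the scope, run the component, always close.
function runComponent(component, input) {
    try {
        SCOPE_STUB.create();
        return component.exec(input);
    } finally {
        SCOPE_STUB.close(); // runs on normal return and on exceptions alike
    }
}

var failing = { exec: function () { throw new Error("component failed"); } };
var caught = null;
try {
    runComponent(failing, []);
} catch (error) {
    caught = error.message;
}
```

After the failed run, scopeCalls contains both "create" and "close", and the error still reaches the caller.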

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the VQL statement just has to be written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally they must be escaped with the '' character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl


4.1 NAMING CONVENTIONS AND ANNOTATIONS

The following naming conventions allow some custom functions to be defined without Java annotations, although using annotations is recommended. All the names used in the naming conventions are case sensitive.

To make a Java class recognizable as a custom function without Java annotations, its name must match the following pattern:

• <FunctionName> + "ItpFunction"

This way, a Java class named Concat_SampleItpFunction will be interpreted as a function named Concat_Sample.

All Java methods implementing the function signatures must be named execute. The signature associated with each method is extracted from the Java method parameters. For example, a class named Concat_SampleItpFunction with a method execute(valueA:String, valueB:String):String will generate the function signature CONCAT_SAMPLE(arg1:text, arg2:text).

To define a parameter with arity n in a custom function, the last parameter has to be an array. E.g. the class Concat_SampleItpFunction with a method declared as public String execute(String... inputs).

Custom functions whose return type depends on the type of their input parameters, or which return an array or register, can define an additional method with a signature equivalent to that of execute. This additional method must be named executeReturnType. The definition of this method is optional. If it is not present, the execute method will be called and the return type will be obtained from the results of the execution. The advantage of defining executeReturnType is that, in some cases, calculating the return type is much less complex and time consuming than actually executing the function, so providing this method improves performance.

Naming conventions only cover a subset of all the possible custom functions. To avoid the limitations of naming conventions, it is recommended to use the Java annotations provided by Denodo in the jar file $DENODO_HOME/lib/contrib/denodo-custom.jar. These annotations are:

• com.denodo.common.custom.annotations.CustomElement: class annotation used to define the class as a custom function. The annotation requires the parameters:
  • name: name of the custom function.
  • type: in ITPilot it must be CustomElementType.ITPFUNCTION.

• com.denodo.common.custom.annotations.CustomExecutor: method annotation used to specify the method as a function signature. This method will be executed when using the function with the appropriate arguments. The annotation has an optional variable, syntax, used to specify the syntax of the function signature when presenting it to the user in the Wrapper Generation Tool.

• com.denodo.common.custom.annotations.CustomExecutorReturnType: method annotation used to specify the method as the one used to compute the return type of a function signature before executing a query.

• com.denodo.common.custom.annotations.CustomParam: parameter annotation with the parameter name, used to make the auto-generated syntax description of the signature more user friendly. If this annotation is not used, the syntax will use the names arg1, arg2, etc. to represent the input parameters.
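The class-name pattern from the naming conventions at the start of this section can be restated as a few lines of code. The sketch is written in JavaScript, like the wrapper examples in section 5, and is purely illustrative: functionNameFromClass is a hypothetical helper, not part of the Denodo API, and it only mirrors the rule that a class is recognized when its name ends, case-sensitively, in "ItpFunction".

```javascript
// Suffix required by the naming convention.
var SUFFIX = "ItpFunction";

// Hypothetical helper: returns the function name derived from a Java class
// name, or null when the class does not follow the convention.
function functionNameFromClass(className) {
    var cut = className.length - SUFFIX.length;
    // The comparison is case sensitive, as the naming conventions require.
    if (cut <= 0 || className.substring(cut) !== SUFFIX) {
        return null;
    }
    return className.substring(0, cut);
}

var recognized = functionNameFromClass("Concat_SampleItpFunction");
var wrongCase = functionNameFromClass("Concat_Sampleitpfunction");
```

Here recognized is "Concat_Sample", while wrongCase is null because the suffix differs in case.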

4.2 COMPOUND TYPES

Compound types and values in the custom functions are defined by the following Java classes:

• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either a java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object in their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on input values, or functions returning compound types, can implement an additional method in order to compute the return type without executing the function. This is entirely optional, but it provides better performance when the execution of the function is slower or more memory intensive than the return type calculation. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, then the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same or an equivalent type as its respective parameter in the execute method. If the execute method returns a basic Java type, the additional method has to return the same basic Java class. I.e.:
   • If the execute method returns a String object, the additional method has to return java.lang.String.class.
   • If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object.
   • If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See table "Equivalency between Java and ITPilot data types" at the beginning of section 4 to know the type that these return parameters will have in ITPilot.

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        @CustomExecutor()
        public CustomArrayValue split_sample(@CustomParam(name="regexp") String regex,
                @CustomParam(name="valuer") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2 ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Although Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development, ITPilot also allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some basic prior knowledge of the JavaScript language.

Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications given in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
        // component implementation
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist (see note 1).

These functions correspond to the definition of the process as components: the input parameters are those defined in the Initialization component, and the output record is defined exactly as it is received by the Output component.

Note 1: Since version 4.0SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on in section 5.3 (the component catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure

The getOutputSchema() function determines the wrapper's output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). However, rs.setText("FIELD", OBLIGATORY) is not valid; the following must be used instead: rs.setText("FIELD", "", OBLIGATORY).
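The positional rule in the note above can be illustrated with a small sketch. The function describeSetText below is a hypothetical stand-in written for this illustration, not the ITPilot implementation of setText; it only mimics the documented defaults (an empty regular expression and an optional field), and the OPTIONAL and OBLIGATORY values are assumed to be plain constants.

```javascript
// Hypothetical stand-in for setText(field, regexp, type); illustration only.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

function describeSetText(field, regexp, type) {
  // Arguments are matched strictly by position, so only trailing ones can be left out.
  if (regexp === undefined) { regexp = ""; }
  if (type === undefined) { type = OPTIONAL; }
  return { field: field, regexp: regexp, type: type };
}

var shortForm = describeSetText("FIELD");                // both defaults apply
var fullForm = describeSetText("FIELD", "", OBLIGATORY); // regexp given so that type can be set
var misplaced = describeSetText("FIELD", OBLIGATORY);    // OBLIGATORY lands in the regexp slot
```

In the last call the value meant for type is read as the regular expression, which is why the regexp argument has to be supplied explicitly whenever type is set.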

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.

5.3.2.1 Record Structure

• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
    • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the structure of the records contained in the array.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the case of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all, or almost all, components and are therefore described in this first section. The catalog indicates which components do not provide some of these "common" functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId values.
  o errorId: indicates the type of error whose handling is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.

    • TIMEOUT_ERROR: error raised if the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowser.MAX_DOWNLOAD_TIME.1 for Internet Explorer, IEBrowser.MAX_DOWNLOAD_TIME.2 for Firefox and IEBrowser.MAX_DOWNLOAD_TIME.3 for the HTTP browser.

    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return "null" when there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return "null", the result is evaluated as "false" if they are used in a condition expression.
    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each component.
    • ON_ERROR_RETRY_IGNORE: reruns the wrapper, as with ON_ERROR_RETRY, but continues with the wrapper execution if the error still occurs after the retries.
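The four error actions can be summarised with a small sketch. The helper runWithPolicy below is hypothetical and is not part of ITPilot; it only imitates the behaviour described above for a single step. It assumes that ON_ERROR_RETRY raises the error once the retries are exhausted, and it leaves out the delay between retries.

```javascript
// Hypothetical helper that imitates the documented error actions; not ITPilot code.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

function runWithPolicy(step, action, retries) {
  var retrying = (action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE);
  var attempts = retrying ? retries + 1 : 1; // first run plus the configured retries
  var lastError = null;
  for (var i = 0; i < attempts; i++) {
    try {
      return step();
    } catch (e) {
      lastError = e; // remember the error and try again if attempts remain
    }
  }
  if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
    return null; // the wrapper continues; most components yield null in this case
  }
  throw lastError; // ON_ERROR_RAISE, or ON_ERROR_RETRY after the last retry (assumption)
}

// A step that fails twice and then succeeds.
var calls = 0;
function flakyStep() {
  calls++;
  if (calls < 3) { throw new Error("TIMEOUT_ERROR"); }
  return "page";
}
var retried = runWithPolicy(flakyStep, ON_ERROR_RETRY, 3);
```

With three retries allowed, the flaky step succeeds on its third call, so retried holds the value returned by the step.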

5.3.3.2 debugLevel function

• debugLevel(level): indicates the trace level to be used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression. It is expressed as a character string; e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning "true" or "false" depending on whether the condition described in the constructor is met when applied to the elements input parameter.
    • elements: this parameter, which must be in the format "[ELEMENT1, ELEMENT2, …, ELEMENTN]", determines the elements on which the condition is evaluated.
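The positional placeholders ($0, $1, ...) used in a condition expression can be illustrated with a small sketch. The function evalComparison below is a hypothetical evaluator written for this illustration; it is not the ITPilot expression engine, and it only understands a single comparison between two placeholders with the operators <=, <, >= and >.

```javascript
// Hypothetical evaluator for expressions such as "($0 <= $1)"; illustration only.
function evalComparison(expr, elements) {
  var match = /^\(?\s*\$(\d+)\s*(<=|>=|<|>)\s*\$(\d+)\s*\)?$/.exec(expr);
  if (match === null) {
    throw new Error("Unsupported expression: " + expr);
  }
  // $n refers to the element at position n of the list passed to exec
  var left = elements[parseInt(match[1], 10)];
  var right = elements[parseInt(match[3], 10)];
  switch (match[2]) {
    case "<=": return left <= right;
    case ">=": return left >= right;
    case "<": return left < right;
    default: return left > right;
  }
}

var met = evalComparison("($0 <= $1)", [3, 5]);    // true: 3 <= 5
var notMet = evalComparison("($0 <= $1)", [7, 5]); // false: 7 > 5
```

The order of the elements in the list matters: swapping them changes which value each placeholder refers to.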

5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.

5.3.8 Diff

• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red background HTML end tag).
    • tokenSeparator: character string used as the HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of the pages, with the differences between them marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the start tag for added content.
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the end tag for added content.
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the start tag for deleted content.
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the end tag for deleted content.
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
    • nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether the component ignores HTML tag attributes when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes are ignored; with "false" they are taken into account.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page or not.
    • mergedDeletions: with "true", the deleted content is shown. If the value is "false", the configuration set with setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. These regular expressions are applied to the HTML tokens of the page; tokens that match are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
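How prefix and suffix labels mark added and deleted tokens can be sketched as follows. The function diffTokens is a hypothetical illustration based on a longest common subsequence comparison; the algorithm actually used by the Diff component is not described in this guide, and the label values in the example are arbitrary placeholders rather than the component defaults.

```javascript
// Hypothetical token diff (longest common subsequence); not the Diff component's code.
function diffTokens(baseTokens, finalTokens, labels) {
  var n = baseTokens.length;
  var m = finalTokens.length;
  var lcs = [];
  var i, j;
  // lcs[i][j] = length of the longest common subsequence of the two token tails
  for (i = 0; i <= n; i++) {
    lcs[i] = [];
    for (j = 0; j <= m; j++) { lcs[i][j] = 0; }
  }
  for (i = n - 1; i >= 0; i--) {
    for (j = m - 1; j >= 0; j--) {
      if (baseTokens[i] === finalTokens[j]) {
        lcs[i][j] = lcs[i + 1][j + 1] + 1;
      } else {
        lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
      }
    }
  }
  var out = [];
  i = 0;
  j = 0;
  while (i < n && j < m) {
    if (baseTokens[i] === finalTokens[j]) {
      out.push(baseTokens[i]); // unchanged token
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      out.push(labels.deletionPrefix + baseTokens[i] + labels.deletionSuffix);
      i++;
    } else {
      out.push(labels.additionPrefix + finalTokens[j] + labels.additionSuffix);
      j++;
    }
  }
  for (; i < n; i++) { out.push(labels.deletionPrefix + baseTokens[i] + labels.deletionSuffix); }
  for (; j < m; j++) { out.push(labels.additionPrefix + finalTokens[j] + labels.additionSuffix); }
  return out.join(labels.tokenSeparator);
}

var labels = {
  additionPrefix: "[+", additionSuffix: "+]",
  deletionPrefix: "[-", deletionSuffix: "-]",
  tokenSeparator: " "
};
var marked = diffTokens(["a", "b", "c"], ["a", "x", "c"], labels);
// marked is "a [-b-] [+x+] c"
```

When both token lists are equal, the result is simply the tokens joined by the separator, with no labels inserted.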

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression. It is expressed as a character string, using the same positional syntax as condition expressions; e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor
• Description: responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main Extractor method, which runs the specification indicated in the constructor. It returns a list of records of the type defined by the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; "true" if the pattern merging technique is to be applied, "false" if not. It is "true" by default.
  o setI18n(i18n): updates the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit navigation sequence back to the source page, against which further data extraction operations will be executed.
    • back: back sequence (NSEQL program).
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing the information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
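The order in which the synchronization options described for syncWithPost, setBackSequence and setBackPages are considered can be summarised in a short sketch. The function chooseBackStrategy and the shape of its settings object are hypothetical names introduced for this illustration; the default of one page for the Back() command is an assumption, as the guide does not state it.

```javascript
// Hypothetical helper mirroring the documented order of checks; not ITPilot code.
function chooseBackStrategy(settings) {
  // POST synchronization is documented as the default method
  var usePost = settings.syncWithPost === undefined ? true : settings.syncWithPost;
  if (usePost) {
    return { kind: "POST" };
  }
  // Otherwise an explicit back sequence is used, if one has been defined
  if (settings.backSequence) {
    return { kind: "BACK_SEQUENCE", sequence: settings.backSequence };
  }
  // Last resort: the NSEQL Back() command, repeated for the configured number of pages
  var pages = settings.backPages === undefined ? 1 : settings.backPages; // default is an assumption
  return { kind: "BACK", pages: pages };
}

var byDefault = chooseBackStrategy({});
var withSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back sequence>" });
var withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

Here byDefault.kind is "POST", withSequence.kind is "BACK_SEQUENCE" and withBack describes a Back() over two pages.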

5.3.13 Filter

• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: selection expression of the filtering operation, applied to the list of records passed to the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subgroup that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements, except for the one that caused the error.
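The behaviour described in the note can be sketched as follows. The function filterIgnoringErrors is a hypothetical illustration, not the Filter component itself: it takes a plain JavaScript predicate instead of a selection expression, and it simply skips any record whose evaluation throws an error.

```javascript
// Hypothetical sketch of filtering with ON_ERROR_IGNORE semantics; not ITPilot code.
function filterIgnoringErrors(inputRecords, predicate) {
  var kept = [];
  for (var i = 0; i < inputRecords.length; i++) {
    try {
      if (predicate(inputRecords[i])) {
        kept.push(inputRecords[i]);
      }
    } catch (e) {
      // The record that caused the error is left out; the remaining records are still processed.
    }
  }
  return kept;
}

var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 3 }, { PRICE: 25 }];
var expensive = filterIgnoringErrors(records, function (record) {
  return record.PRICE.toFixed(0) >= 10; // throws for the record whose PRICE is null
});
```

The result contains the records with PRICE 10 and 25: the record with PRICE 3 does not meet the condition, and the record with a null PRICE is dropped because its evaluation failed.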

5.3.14 Form Iterator

• Object: Form_Iterator
• Description: generates a run loop over a specific form, using predetermined values for each of the included fields in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: with "true", the component executes its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or only contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field element has the same name.
    • positions: values of the elements over which the component must iterate.
  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held for each element of values in the event of replicated values.
    • equals: boolean value that indicates whether the field values must exactly match those provided to the function or only need to contain them.
  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on Radio Buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be entered in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be entered in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): determines whether the component launches the iterations in parallel.
    • flag: with "true", the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running one iteration of the component.
    • inputPage: optional parameter that indicates a new starting page on which the next iteration is run.
  o hasNext(): determines whether there are more results. It returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit navigation sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence nor a POST navigation has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing by using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL where the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored information (“cookies”).
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field, as defined in [DATEFORMAT]. The format is optional but, if it is specified, it becomes compulsory; otherwise the wrapper may not run.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record that represents the wrapper initialization parameters.
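
The following is a minimal sketch of how the Init functions listed above might be combined. Inside ITPilot, the Init object and the OPTIONAL, OBLIGATORY and FIXED values are supplied by the runtime; here a small stand-in stub is defined so that the fragment runs on its own. The stub, the string representation of the obl values, the field names and the shape of the returned record are assumptions made for illustration, not the ITPilot implementation.

```javascript
// Stand-in values for the obl constants (assumption: the real constants are
// provided by the ITPilot runtime and may not be strings).
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

// Hypothetical stub that mimics only the Init signatures listed above.
function Init() {
    this.fields = {};
}
Init.prototype._add = function (type, field, obl, fixedValue, format) {
    this.fields[field] = {
        type: type,
        obl: obl === undefined ? OPTIONAL : obl,   // OPTIONAL is the documented default
        value: fixedValue === undefined ? null : fixedValue,
        format: format === undefined ? null : format
    };
};
Init.prototype.setText = function (field, obl, fixedValue) {
    this._add("text", field, obl, fixedValue);
};
Init.prototype.setInt = function (field, obl, fixedValue) {
    this._add("int", field, obl, fixedValue);
};
Init.prototype.setDate = function (field, format, obl, fixedValue) {
    this._add("date", field, obl, fixedValue, format);
};
Init.prototype.get = function (name) {
    return this.fields[name].value;
};
Init.prototype.exec = function () {
    return this.fields;   // the stub simply returns the declared fields
};

// Declaring the searchable parameters of a wrapper (hypothetical field names).
var init = new Init();
init.setText("KEYWORDS", OBLIGATORY);            // must be given in every query
init.setInt("MAX_RESULTS");                      // obl omitted: OPTIONAL
init.setText("LANGUAGE", FIXED, "en");           // constant value
init.setDate("SINCE", "yyyy-MM-dd", OPTIONAL);   // pattern in the style of [DATEFORMAT]
var inputRecord = init.exec();
```

Note that setDate takes the format as its second argument, before obl, unlike the other setters.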

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records over which to iterate.
  o hasNext(): determines whether there are more results over which to iterate. Returns “true” if there is at least one more result.
  o next(): returns the next element of the iteration. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute("_functionIterator_1", structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
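
The iteration pattern of Figure 5 can be tried outside ITPilot with the sketch below. The Iterator and Thread definitions are hypothetical stand-ins that follow only the signatures documented in this guide; the FUNCTIONS registry, the record contents and passing the function name as a string are assumptions. The stub Thread runs each call immediately and in order, so it illustrates the structure of the code rather than real concurrency.

```javascript
// Hypothetical stand-ins for the ITPilot Iterator and Thread objects, so the
// fragment runs outside ITPilot. The Thread stub does not create real threads.
function Iterator(list) {
    this.list = list;
    this.pos = 0;
}
Iterator.prototype.hasNext = function () { return this.pos < this.list.length; };
Iterator.prototype.next = function () { return this.list[this.pos++]; };

var FUNCTIONS = {};   // assumption: registry the stub uses to look up functions by name
function Thread() {}
Thread.prototype.execute = function (functionName) {
    var args = Array.prototype.slice.call(arguments, 1);
    FUNCTIONS[functionName].apply(null, args);
};
Thread.prototype.wait = function () { /* nothing is pending in the synchronous stub */ };

// Per-record processing function launched from the iteration.
var processed = [];
FUNCTIONS["_functionIterator_1"] = function (record) {
    processed.push(record.TITLE.toUpperCase());
};

var iterator = new Iterator([{ TITLE: "first" }, { TITLE: "second" }]);
var _thread0 = new Thread();
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    _thread0.execute("_functionIterator_1", recordInstance);
}
_thread0.wait();   // wait until every launched execution has finished
```

Calling wait() after the loop follows the description of the Thread object: results collected by the launched functions should only be read once all executions have finished.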

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class to use to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function
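
A runnable sketch of the structure in Figure 6 follows. The Condition object here is a hypothetical stand-in: it takes a JavaScript predicate, whereas the ITPilot Condition takes a condition expression (see section 5.3.5). The constant values and the pageNumber counter are likewise assumptions used only to make the fragment self-contained.

```javascript
// Hypothetical stand-in for the ITPilot Condition object. Assumption: the stub
// takes a JavaScript predicate instead of an ITPilot condition expression.
var RUNTIME_ERROR = "RUNTIME_ERROR", ON_ERROR_RAISE = "ON_ERROR_RAISE";
function Condition(predicate) { this.predicate = predicate; }
Condition.prototype.onError = function (errorType, action) { this.action = action; };
Condition.prototype.exec = function (inputs) { return this.predicate(inputs); };

var pageNumber = 1;
var visited = [];

// WHILE ... DO: the condition is evaluated before each pass.
var loop = new Condition(function () { return pageNumber <= 3; });
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    visited.push(pageNumber);   // stands for <loop operations>
    pageNumber++;
}
```

Because the condition is checked first, the loop body does not run at all when the condition is false on entry.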

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: allows iteration over several inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither an explicit back sequence nor a POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds, at a later point, the component input record to be used as the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; indicates a home page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
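
The practical difference between the while form used by the Loop component (section 5.3.19) and the do… while form used by Repeat can be seen in plain JavaScript. In this sketch, conditionHolds is a hypothetical function that stands in for the exec([]) call on the Condition object, so that the fragment runs outside ITPilot.

```javascript
// `conditionHolds` stands in for loop.exec([]) / repeat.exec([]) (assumption:
// a plain function is used so that the fragment runs outside ITPilot).
function conditionHolds() { return false; }

var whilePasses = 0;
while (conditionHolds()) {      // Loop form: the condition is checked first
    whilePasses++;
}

var repeatPasses = 0;
do {                            // Repeat form: the body runs, then the condition is checked
    repeatPasses++;
} while (conditionHolds());
```

With a condition that is false from the start, the while body never runs, whereas the do… while body runs exactly once.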

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it holds within the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not specified, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither an explicit back sequence nor a POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.

5.3.28 Store File

• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): puts the thread on standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the specified function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the mycustom_getOutputType function is RECORD_TYPE or LIST_TYPE.
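
As an illustration, a custom component file might look like the sketch below. It is a minimal example under stated assumptions: the component name “mycustom” follows the naming used above, the input is treated as a plain string, and STRING_TYPE is defined locally with the value listed above so that the fragment runs outside ITPilot. The getInputStructure and getOutputStructure functions are left out because their bodies are not shown in this guide; the latter is only needed for RECORD_TYPE or LIST_TYPE outputs.

```javascript
// Value taken from the list of output types above; defined locally so that the
// fragment runs outside ITPilot (assumption).
var STRING_TYPE = 13;

// Main function of a hypothetical custom component named "mycustom".
// Assumption: the input is treated as a plain string for the sake of the sketch.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    mycustom_output = String(mycustom_input).toUpperCase();
    return mycustom_output;
}

// Declares the output type of the component.
function mycustom_getOutputType() {
    return STRING_TYPE;
}
```

Every function name carries the component name as a prefix, which is how the main function of a component is identified.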

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘\’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis/

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap/

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

• com.denodo.common.custom.elements.CustomRecordType: class representing a register data type. It stores the type name and a set of name-type pairs, where the name is a string and the type is either a java.lang.Class of one of the Java classes used for simple types or a Denodo compound type (CustomRecordType or CustomArrayType).

• com.denodo.common.custom.elements.CustomRecordValue: class representing a register data value. It stores a set of name-value pairs, where the name is a string and the value is either an instance of a simple type (java.lang.String, java.lang.Integer, etc.) or another compound value (CustomRecordValue or CustomArrayValue).

• com.denodo.common.custom.elements.CustomArrayType: class representing an array data type. It stores the type name and an instance of CustomRecordType that defines the type of the elements of the array.

• com.denodo.common.custom.elements.CustomArrayValue: class representing an array value. It stores a list of CustomRecordValue instances.

• com.denodo.common.custom.elements.CustomElementsUtil: helper class with methods to instantiate compound types and values if needed.

4.3 PAGE TYPE

ITPilot custom functions can also receive a PageValue object in their arguments. The type of this object is com.denodo.common.custom.elements.CustomPageValue, and it contains the URL of the last page, the method and POST parameters, and the page cookies.

4.4 CUSTOM FUNCTION RETURN TYPE

As explained before, custom functions whose return type depends on the input values, or functions returning compound types, can implement an additional method that computes the return type without executing the function. Except in the case described in rule 1 below, this is optional, but it provides better performance when executing the function is slower or more memory-intensive than calculating the return type. This additional method must follow a few rules:

1. When the execute method returns a non-constant compound type (a record whose fields, that is, their number, names and/or types, depend on the input parameters) or a java.lang.Object, the additional method must be implemented. In other situations it is optional (the return type is obtained from the method directly).
2. The execute method must have the same number of parameters as the additional method.
3. Each parameter of the additional method must have the same or an equivalent type as its respective parameter in the execute method.

If the execute method returns a basic Java type, the additional method has to return the same basic Java class; i.e., if the execute method returns a String object, the additional method has to return java.lang.String.class. If the execute method returns a CustomRecordValue object, the additional method has to return a CustomRecordType object. If the execute method returns a CustomArrayValue object, the additional method has to return a CustomArrayType object.

See the table ‘Equivalency between Java and ITPilot data types’ at the beginning of section 4 to know the type that these return values will have in ITPilot.

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type = CustomElementType.ITPFUNCTION, name = "SPLIT_SAMPLE")
    public class Split {

        // Name of the single field of each record in the returned array
        private static final String STRING_FIELD = "string";

        // Execute method: one record per substring
        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name = "regexp") String regex,
                @CustomParam(name = "valuer") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        // Additional method: computes the return type without executing the function
        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2 ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Denodo provides a graphical, component-based wrapper generation tool that creates wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.

Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", "", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3 ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main(): the only mandatory function; it contains the component implementation.
2. getInit(): returns the set of searchable parameters.
3. getOutputSchema(): returns the structure of the output objects, if they exist (1).

These functions mirror the definition of the process as components: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

(1) Since version 4.0SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init();, creates a new parameter initialization object. This object is described in section 5.3 (the component catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described in section 5.3.

5.2.3 Generating the Output Structure

The getOutputSchema() function determines the wrapper’s output structure, if one exists. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). However, rs.setText("FIELD", OBLIGATORY) is not valid; the following must be used instead: rs.setText("FIELD", "", OBLIGATORY).
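
As a rough illustration of the rule in the NOTE, the following sketch models positional optional arguments in plain JavaScript. RecordStructureSketch and the string values given to OPTIONAL and OBLIGATORY are stand-ins invented for this example; they are not the ITPilot implementation, and only the defaults of setText are modelled.

```javascript
// Hypothetical stand-ins for the ITPilot constants (their real values are not given in this guide).
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Minimal stand-in for Record_Structure; it only records what setText receives.
function RecordStructureSketch(name) {
    this.name = name;
    this.fields = [];
}

RecordStructureSketch.prototype.setText = function (field, regexp, type) {
    // Omitted trailing arguments arrive as undefined and take their defaults.
    if (regexp === undefined) { regexp = ""; }
    if (type === undefined) { type = OPTIONAL; }
    this.fields.push({ field: field, regexp: regexp, type: type });
};

var rs = new RecordStructureSketch("OUT_REC");
rs.setText("A");                  // same as rs.setText("A", "", OPTIONAL)
rs.setText("B", "", OBLIGATORY);  // regexp must be supplied to reach the type argument
rs.setText("C", OBLIGATORY);      // mistake: OBLIGATORY lands in the regexp position
```

In the last call the argument meant as the type is taken as the regular expression, which is why an argument cannot be skipped when one to its right has to be set.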

5.3.2 Data Structures

ITPilot defines two data structures: List and Record (a data record defined by the Record Structure object). The following sections describe them.

5.3.2.1 Record Structure

• Object: Record_Structure
• Description: a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.

    • field: name of the new field.
    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.

  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp (optional): regular expression that constrains the character string. By default, if no constraint exists, its value is "".
    • format (optional): date format following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the structure of the records contained in the array.
    • type (optional): defines whether the field is mandatory or not. By default, the field is optional.
  o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these “common” functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave in the event of an error. The onError function can be invoked several times with different errorId values.
  o errorId: type of error whose handling is being configured. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the properties IEBrowser.MAX_DOWNLOAD_TIME.1 for Internet Explorer, IEBrowser.MAX_DOWNLOAD_TIME.2 for Firefox and IEBrowser.MAX_DOWNLOAD_TIME.3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value return “null” when there is an error, except FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, it is evaluated as “false” if they are used in a condition expression.
    • ON_ERROR_RETRY: reruns the wrapper. The number of retries and the time between retries are configured in each component.
    • ON_ERROR_RETRY_IGNORE: reruns the wrapper as with ON_ERROR_RETRY, but continues with the wrapper execution if the error persists after the retries.
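
The four error actions can be read as a small retry policy. The sketch below is one plausible interpretation in plain JavaScript, not ITPilot code: runWithPolicy, the string values of the constants and the flaky step are hypothetical, the delay between retries is left out, a single step is retried rather than a whole wrapper, and it assumes that ON_ERROR_RETRY raises the error once the retries are exhausted.

```javascript
// Hypothetical stand-ins for the ITPilot error-action constants.
var ON_ERROR_RAISE = "RAISE";
var ON_ERROR_IGNORE = "IGNORE";
var ON_ERROR_RETRY = "RETRY";
var ON_ERROR_RETRY_IGNORE = "RETRY_IGNORE";

// Runs step() under the given action; retries is the number of extra attempts.
function runWithPolicy(step, errorAction, retries) {
    var attempts = 1;
    if (errorAction === ON_ERROR_RETRY || errorAction === ON_ERROR_RETRY_IGNORE) {
        attempts = 1 + retries;
    }
    var lastError = null;
    for (var i = 0; i < attempts; i++) {
        try {
            return step();
        } catch (e) {
            lastError = e;   // remember the failure and try again if attempts remain
        }
    }
    if (errorAction === ON_ERROR_IGNORE || errorAction === ON_ERROR_RETRY_IGNORE) {
        return null;         // ignored errors surface as a null result
    }
    throw lastError;         // ON_ERROR_RAISE, or ON_ERROR_RETRY with no attempts left
}

// A step that fails twice and then succeeds.
var calls = 0;
function flaky() {
    calls++;
    if (calls < 3) { throw new Error("timeout"); }
    return "page";
}

var retried = runWithPolicy(flaky, ON_ERROR_RETRY, 3);
var ignored = runWithPolicy(function () { throw new Error("boom"); }, ON_ERROR_IGNORE, 0);
```

Here the retried step succeeds on its third call, while the ignored failure yields null so that the caller can carry on.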

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level used when running the component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: this parameter defines the condition expression. It is expressed as a string of characters (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It carries out the condition operation, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
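
To make the $0, $1 notation concrete, here is a hedged sketch in which each placeholder is bound to the element at the same position of the list given to exec. ConditionSketch is a hypothetical stand-in that evaluates the expression as plain JavaScript; the real component relies on the ITPilot expression functions from Appendix A of [GENER], which are not reproduced here.

```javascript
// Stand-in for the Condition component: keeps the expression string for later evaluation.
function ConditionSketch(expr) {
    this.expr = expr;
}

ConditionSketch.prototype.exec = function (elements) {
    // Build the parameter names $0, $1, ... so that $N refers to elements[N].
    var names = [];
    for (var i = 0; i < elements.length; i++) {
        names.push("$" + i);
    }
    // Assumption of this sketch: the expression happens to be valid JavaScript.
    var evaluate = new Function(names.join(","), "return (" + this.expr + ");");
    return evaluate.apply(null, elements) ? true : false;
};

var myCondition = new ConditionSketch("($0 <= $1)");
var met = myCondition.exec([3, 5]);     // first element is less than or equal to the second
var notMet = myCondition.exec([7, 5]);  // first element is greater than the second
```

The position of each element in the array is what ties it to its placeholder, so the order of the list passed to exec matters.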

5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.

5.3.8 Diff

• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for the new content when generating the result page (by default, a green background HTML tag).
    • additionSuffixLabel: suffix to use for the new content when generating the result page (by default, a green background HTML end tag).
    • deletionPrefixLabel: prefix to use for the deleted content when generating the result page (by default, a red background HTML tag).
    • deletionSuffixLabel: suffix to use for the deleted content when generating the result page (by default, a red background HTML end tag).
    • tokenSeparator: character string used as the separator between HTML page elements when the result page is generated, so that each of them can be adequately identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of the pages, with the differences between them pointed out.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag of the added data.
    • additionPrefixLabel: prefix to use for the new content when generating the result page (by default, a green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag of the added data.
    • additionSuffixLabel: suffix to use for the new content when generating the result page (by default, a green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag of the deleted data.
    • deletionPrefixLabel: prefix to use for the deleted content when generating the result page (by default, a red background HTML tag).

  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag of the deleted data.
    • deletionSuffixLabel: suffix to use for the deleted content when generating the result page (by default, a red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.
    • nullWhenEquals: “true” means that “null” is returned when both pages are equal; “false” means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): the component does not take the HTML tag attributes into account when comparing both pages.
    • simplifyTags: “true” means that the HTML tag attributes are ignored; with “false” they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page.
    • mergedDeletions: with “true”, the deleted content is shown. If the value is “false”, the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page. Tokens that match the regular expression are discarded before starting the comparison.
    • regexp: Perl [PERL] regular expression.
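
The role of the prefix and suffix labels can be pictured with a token-level comparison. The following is only a sketch under stated assumptions: diffTokens is a hypothetical helper built on a textbook longest-common-subsequence comparison (the guide does not say which algorithm the Diff component uses), the pages are assumed to be already split into tokens, and the bracket labels stand in for the default coloured HTML tags.

```javascript
// Marks tokens removed from baseTokens and tokens added in finalTokens with the given labels.
function diffTokens(baseTokens, finalTokens, labels) {
    var n = baseTokens.length;
    var m = finalTokens.length;
    var i, j;
    // lcs[i][j] = length of the longest common subsequence of baseTokens[i..] and finalTokens[j..]
    var lcs = [];
    for (i = 0; i <= n; i++) {
        lcs[i] = [];
        for (j = 0; j <= m; j++) { lcs[i][j] = 0; }
    }
    for (i = n - 1; i >= 0; i--) {
        for (j = m - 1; j >= 0; j--) {
            if (baseTokens[i] === finalTokens[j]) {
                lcs[i][j] = lcs[i + 1][j + 1] + 1;
            } else {
                lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }
    }
    // Walk both token lists, keeping common tokens and labelling the rest.
    var out = [];
    i = 0;
    j = 0;
    while (i < n && j < m) {
        if (baseTokens[i] === finalTokens[j]) {
            out.push(baseTokens[i]);
            i++;
            j++;
        } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
            out.push(labels.deletionPrefix + baseTokens[i] + labels.deletionSuffix);
            i++;
        } else {
            out.push(labels.additionPrefix + finalTokens[j] + labels.additionSuffix);
            j++;
        }
    }
    while (i < n) { out.push(labels.deletionPrefix + baseTokens[i] + labels.deletionSuffix); i++; }
    while (j < m) { out.push(labels.additionPrefix + finalTokens[j] + labels.additionSuffix); j++; }
    return out;
}

var labels = {
    additionPrefix: "[+", additionSuffix: "+]",
    deletionPrefix: "[-", deletionSuffix: "-]"
};
var merged = diffTokens(["<p>", "old", "</p>"], ["<p>", "new", "</p>"], labels).join(" ");
// merged is "<p> [-old-] [+new+] </p>"
```

Unchanged tokens pass through untouched, which is why the result can still be read as a page with the differences highlighted.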

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4 Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor
• Description: extracts structured data from an HTML page by means of a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that is used to return the data extracted by the specification.
  o exec(): main extractor method; it runs the specification indicated in the constructor. This function returns a list of records of the type defined by the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; “true” if the pattern merging technique is to be applied, “false” if not. It is “true” by default.
  o setI18n(i18n): updates the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence exists (defined by the setBackSequence function); if it does not exist, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit navigation sequence that returns to the originating page, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

    • reusingConnection: if the value is set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, if neither a back sequence nor POST synchronization has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.

5.3.13 Filter

bull Object Filter

bull Description this carries out a filtering operation from a list of records returning those meeting a given condition

bull Functions

o Constructor(expr, auxiliaryRecords)

bull expr regular expression of the filtering operation, applied to the list of records passed to the exec function

bull auxiliaryRecords list of records that take part in the filter condition but are not the records to filter

o exec(inputRecords, auxiliaryRecords) function that receives a list of records and returns the subset that satisfies the selection expression indicated in the constructor

bull inputRecords list of input records

bull auxiliaryRecords list of records that take part in the filter condition but are not the records to filter

NOTE If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error
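The behaviour described in the note above can be pictured with a plain JavaScript stand-in. This is only a sketch under stated assumptions: filterRecords, the predicate callback, the record shape and the constant values are hypothetical and are not part of the ITPilot API; the real component evaluates an ITPilot expression rather than a callback.

```javascript
// Hypothetical stand-in for the Filter component; not the ITPilot implementation.
// The constant values below are assumed for illustration only.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Returns the records for which predicate(record) is true.
// With ON_ERROR_IGNORE, a record whose evaluation throws is skipped.
function filterRecords(inputRecords, predicate, errorAction) {
    var result = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) {
                result.push(inputRecords[i]);
            }
        } catch (e) {
            if (errorAction !== ON_ERROR_IGNORE) {
                throw e; // any other error action stops the run
            }
            // ON_ERROR_IGNORE: drop only the offending record and continue
        }
    }
    return result;
}

var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];

// Hypothetical condition: fails on records without a price
function hasAffordablePrice(record) {
    if (record.PRICE === null) {
        throw new Error("PRICE is missing");
    }
    return record.PRICE < 40;
}

var kept = filterRecords(records, hasAffordablePrice, ON_ERROR_IGNORE);
```

Here kept holds the first and third records; the second one, whose evaluation failed, is left out instead of aborting the whole filtering operation.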

5.3.14 Form Iterator

bull Object Form_Iterator

bull Description this allows a run loop to be generated for a specific form where predetermined values for each of the fields included are used in each run

bull Functions

o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

bull findForm NSEQL program that locates the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL)

bull submitForm NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL)

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reusableConnection this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information

bull baseElements optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component

bull inputPage input page from which the selected form can be iteratively invoked

bull parallelIterator if “true”, the component will execute its iterations in parallel

o selectMultiplePositions(field, position, positionsArray, clickedArray) indicates which positions are selected in a multiple selection field in the target form

bull field name of the multiple selection field

bull position position of the field among those with the same name, starting at position 0

bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values

bull clickedArray list that indicates whether each valuesArray element can be marked, not marked, or both. There are certain JavaScript constants defined for this

bull CLICKED_ELEMENT mark the element

bull NON_CLICKED_ELEMENT leave the element as unmarked

bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked

o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray) this indicates the values selected from a multiple selection field for the chosen form

bull field name of the multiple selection field

bull position position of the field among those with the same name, starting at position 0

bull valuesArray list of values that must be selected in the field

bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values

bull equalsArray list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false)

bull clickedArray list that indicates whether each valuesArray element can be marked, not marked, or both. There are certain JavaScript constants defined for this

bull CLICKED_ELEMENT mark the element

bull NON_CLICKED_ELEMENT leave the element as unmarked

bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked

o selectPositions(field, position, positions) this indicates the values selected from a selection field for the chosen form

bull field name of the HTML selection field

bull position position occupied by the field in the event of more than one field with the same name

bull positions values of the elements on which the component must iterate

o selectTexts(field, position, values, positions, equal) this indicates the values to be used in the different iterations on a text field

bull field name of the HTML text field

bull position position of the field in the event of several on the form with the same name

bull values list of values that must be selected in the field

bull positions list that indicates the position held for each value element in the event of replicated values

bull equals boolean value that indicates whether the field values must exactly match those provided to the function or may simply contain them

o click(field, value, state) function that selects an element and runs a “click” event on it

bull field name of the HTML field on which the click is to be made

bull value when this function is run on Radio Buttons, this parameter indicates the elements selected as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element

bull state when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element

bull CLICKED_ELEMENT mark the element

bull NON_CLICKED_ELEMENT leave the element as unmarked

bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked

o input(field, position, values) function that indicates the values added to an input field

bull field name of the HTML input field

bull position position of the field in the event of several on the form with the same name

bull values list of values that must be selected in the field

o textarea(field, position, values) this indicates the values added to a text area

bull field name of the HTML input field

bull position position of the field in the event of several on the form with the same name

bull values list of values that must be selected in the field

o toList() returns the list with the NSEQL sequences used in each iteration

o setMaxIterations(count) sets the maximum number of iterations that can be executed

bull count number that determines the maximum number of iterations

o setRetries(count) update method for the number of retries in the event of failures

bull count number of retries

o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated

bull mseconds this indicates the waiting time between retries in milliseconds

o setParallelIterator(flag) the component launches the iteration in parallel

bull flag if “true”, the iterations will be executed in parallel

o next(inputPage) this returns the page resulting from running a component iteration

bull inputPage optional parameter that allows for a new starting page to be indicated on which a new component iteration is run

o hasNext() function that determines whether there are more results. The function returns “true” if there is at least one more result or “false” if there is not

o close() function that closes the iterator

o syncWithPost(flag) this function indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method

bull flag “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run

o setBackSequence(back) this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that further data extraction operations can be carried out on it

bull back NSEQL back program

o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not

bull reusingConnection if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session

o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 Denodo HTTP browser implementation
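The effect of the clickedArray constants on the number of generated iterations can be sketched in plain JavaScript. The sketch assumes, as described for CLICKED_AND_NON_CLICKED_ELEMENT, that such an element contributes two combinations. The function expandClickStates and the numeric values given to the constants are hypothetical; the real constants are defined by ITPilot and their values are not stated in this guide.

```javascript
// Assumed values, for illustration only; ITPilot defines the real constants.
var CLICKED_ELEMENT = 0;
var NON_CLICKED_ELEMENT = 1;
var CLICKED_AND_NON_CLICKED_ELEMENT = 2;

// Hypothetical helper: returns every combination of marked (true) and
// unmarked (false) states described by a clickedArray.
function expandClickStates(clickedArray) {
    var combinations = [[]];
    clickedArray.forEach(function (state) {
        var options;
        if (state === CLICKED_ELEMENT) {
            options = [true];
        } else if (state === NON_CLICKED_ELEMENT) {
            options = [false];
        } else if (state === CLICKED_AND_NON_CLICKED_ELEMENT) {
            options = [true, false];
        } else {
            throw new Error("Unknown click state: " + state);
        }
        var extended = [];
        combinations.forEach(function (partial) {
            options.forEach(function (option) {
                extended.push(partial.concat([option]));
            });
        });
        combinations = extended;
    });
    return combinations;
}

var combos = expandClickStates([
    CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT,
    NON_CLICKED_ELEMENT
]);
```

With one element in the "both" state, combos contains two combinations, one with the second element marked and one with it unmarked. Each additional element in that state doubles the count, which is worth keeping in mind together with setMaxIterations.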

5.3.15 Get Page

bull Object Get_Page

bull Description obtains an active browser from the browser pool from a previously retrieved identification code

bull Functions

o Constructor(browserUuid) obtains (or optionally creates) the handler to an active browser from its identification

bull browserUuid browser id

o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain) executes the component and returns a Page object with information about the browser’s current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27)

bull pageType type of browser used to access the page

bull SEQUENCE_IEBROWSER = 1

bull SEQUENCE_HTTP_BROWSER = 2

bull lastURL last URL where the page is coming from

bull lastURLMethod access method (GET, POST) of the URL the page is coming from

bull lastURLPostParameters POST-method parameters of the URL the page is coming from

bull cookie information storage “cookies”

bull proxyUser user name to access the Proxy if required

bull proxyPassword user password to access the Proxy if required

bull proxyDomain Proxy domain if required

5.3.16 Init

bull Object Init

bull Description is responsible for storing the structure of the input data which is the data that the wrapper will receive from the calling application

bull Functions

o Constructor(input, output)

bull input input record of the component. Optional; used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context

bull output name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record will be generated at runtime (with the exec() function)

o get(name) this returns the value of a record field created as a group of initialization parameters

bull name name of the record field

o setText(field, obl, fixedValue) this creates a text-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setInt(field, obl, fixedValue) this creates an integer-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setLong(field, obl, fixedValue) this creates a long-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setFloat(field, obl, fixedValue) this creates a float-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setDouble(field, obl, fixedValue) this creates a double-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setBlob(field, obl, fixedValue) this creates a BLOB-type (binary large object) field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setBoolean(field, obl, fixedValue) this creates a Boolean-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setLink(field, obl, fixedValue) this creates a URL-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setDate(field, format, obl, fixedValue) this creates a date-type field in the initialization record

bull field name of the field to create

bull format representation format of the date field This format is optional but becomes compulsory if completed Otherwise the wrapper may not be run This representation format is defined in [DATEFORMAT]

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setName(name) update function for the component name

bull name new component name

o setI18n(i18n) function which updates the process i18n

bull i18n type of internationalization to be used ITPilot provides different types of i18n configurations such as ES_EURO US_PST GB etc See [GENER] for more information about internationalization in ITPilot

o exec() main function for running the component returning a record representing the wrapper initialization parameters
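The three values of the obl parameter can be illustrated with a small stand-in written in plain JavaScript. This is a sketch, not the Init component itself: InitSketch, the string values given to the constants, the sample field names and the explicit queryParams argument of exec are all assumptions made for the example (the real exec() takes no arguments and obtains the input from the JavaScript context).

```javascript
// Assumed values, for illustration only; ITPilot defines the real constants.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical stand-in for the Init component.
function InitSketch() {
    this.fields = [];
}

// Declares a text field; obl defaults to OPTIONAL, as described above.
InitSketch.prototype.setText = function (field, obl, fixedValue) {
    this.fields.push({ name: field, obl: obl || OPTIONAL, fixedValue: fixedValue });
};

// Builds the initialization record from the parameters of one query.
InitSketch.prototype.exec = function (queryParams) {
    var record = {};
    this.fields.forEach(function (f) {
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue;              // constant value
        } else if (Object.prototype.hasOwnProperty.call(queryParams, f.name)) {
            record[f.name] = queryParams[f.name];       // value given in the query
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        }
        // OPTIONAL fields without a value are simply left out
    });
    return record;
};

var init = new InitSketch();
init.setText("KEYWORD", OBLIGATORY);
init.setText("LANGUAGE", FIXED, "en");
init.setText("CATEGORY");
var initRecord = init.exec({ KEYWORD: "denodo" });
```

In this sketch initRecord contains KEYWORD and LANGUAGE but no CATEGORY, and calling init.exec({}) fails because the obligatory KEYWORD is missing.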

5.3.17 Iterator

bull Object Iterator

bull Description component that iterates on a list of records one by one

bull Functions

o Constructor(list)

bull list list of records on which to iterate

o hasNext() this determines whether there are more results on which to iterate. “true” is returned if there is at least one more result

o next() this returns the next iteration element The list is a sorted sequence of records

The “Parallel Execution” option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
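For comparison, the sequential use of hasNext() and next() can be sketched with a plain JavaScript stand-in. ListIterator and the sample records are hypothetical and only mimic the two functions documented for the Iterator component; they do not reproduce the ITPilot implementation.

```javascript
// Hypothetical stand-in that mimics the hasNext()/next() contract.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}

// True while at least one record remains.
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};

// Returns the next record, in list order.
ListIterator.prototype.next = function () {
    return this.list[this.position++];
};

var iterator = new ListIterator([{ TITLE: "a" }, { TITLE: "b" }, { TITLE: "c" }]);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE); // per-record processing goes here
}
```

The records are visited one by one in the order of the list, which is the behaviour to expect when the parallel option is not used.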

5.3.18 JDBCExtractor

bull Object JDBCExtractor

bull Description These functions allow sending a query to any source available via JDBC and return a record list with the obtained results

bull Functions

o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

bull uuid component unique identifier

bull uri connection URL to the database

bull driver driver class to use to connect to the data source

bull userName user name

bull password user password

bull structure structure of the component’s output record list. It is defined as a record of values

bull baseRecords record list to be used

bull maxPoolSize maximum number of connections that can be managed by the connection pool at the same time

bull initialPoolSize initial number of connections in the pool. This number of idle connections is established, ready to be used

bull checkQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists

bull query SQL query that returns the results required by the component

o exec(query, baseRecords) executes the JDBCExtractor component

bull query SQL query that returns the results required by the component

bull baseRecords record list to be used

o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery) updates the pool configuration

bull maxPoolSize maximum number of connections that can be managed by the connection pool at the same time

bull initialPoolSize initial number of connections in the pool. This number of idle connections is established, ready to be used

bull pingQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists

o disablePool() disables the connection pool

o addDriverProperty(propname, propvalue) adds a JDBC driver property

bull propname property name

bull propvalue property value

5.3.19 Loop

bull Description This allows loops to be made in the flow. The loop will be repeated as long as the given condition is met (WHILE... DO). The Loop component is implemented in JavaScript using a while loop with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }

Figure 6 Using the Loop function
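The control flow of Figure 6 can be tried out in plain JavaScript with a stub in place of the Condition object. ConditionStub is hypothetical: it wraps a callback instead of an ITPilot condition expression and only mimics the exec(records) call used in the figure.

```javascript
// Hypothetical stub: evaluates a callback instead of an ITPilot expression.
function ConditionStub(test) {
    this.test = test;
}
ConditionStub.prototype.exec = function (records) {
    return this.test(records);
};

// The body runs as long as the condition holds.
var pagesVisited = 0;
var loop = new ConditionStub(function () { return pagesVisited < 3; });
while (loop.exec([])) {
    pagesVisited++; // loop operations go here
}

// If the condition is false from the start, the body never runs.
var neverRuns = 0;
var skipped = new ConditionStub(function () { return false; });
while (skipped.exec([])) {
    neverRuns++;
}
```

Because the condition is checked before each pass, a Loop whose condition is initially false executes its operations zero times.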

5.3.20 Next Interval Iterator

bull Object Next_Interval_Iterator

bull Description this allows iterating through different inter-related pages by means of one or several browsing sequences

bull Functions

o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

bull sequences list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration

bull iterations this indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reuse boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session’s information

bull inputPage this indicates the page from which the next browsing sequence is to be made

o next(inputRecords, inputPage) this returns the next iteration element

bull inputRecords list of input records that can be used as parameters within the browsing sequences at the next interval

bull inputPage this indicates the page from which the next pages are to be accessed

o close() this closes the iterator

o setRetries(count) this configures the number of retries in the event of error in accessing the next page

bull count number of retries

o setRetryDelay(count) this configures the interval between two retries

bull count interval in milliseconds

o syncWithPost(flag) this function indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function

bull flag “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run

o setBackSequence(back) this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that further data extraction operations can be carried out on it

bull back NSEQL back program

o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not

bull reusingConnection if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session

o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 HTTP browser implementation

5.3.21 Output

bull Object Output

bull Description this places a record in the wrapper output

bull Functions

o Constructor(structure)

bull structure parameter that indicates the component input record to be used as the wrapper result

o add(record) this adds the given component input record to the wrapper result

bull record record to use

5.3.22 Record Constructor

bull Object Record_Constructor

bull Description this allows a record to be constructed using other records generated in the flow as well as generating new attributes derived from already existing ones

bull Functions

o Constructor(recordsObj, name)

bull recordsObj list of input elements Each element from the list can be a record or a list of records

bull name name of the output record of the Record Constructor component

o add(fieldName, expression, errorAction) method for adding a new field to the record under construction

bull fieldName name of the field

bull expression field definition expression, e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]

bull errorAction action to be run in the event of it not being possible to assess the expression correctly The possible values are

bull ON_ERROR_RAISE stop wrapper run indicating the source of the error

bull ON_ERROR_IGNORE ignore the error continuing with the wrapper run

o exec() this runs the Record Constructor component instance returning an object that represents the record obtained

NOTE If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error
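How a field reference in the expression parameter picks a value out of recordsObj can be sketched in plain JavaScript. The sketch assumes that a reference has the form $<index>.<FIELD>, where the index selects an element of the recordsObj list; resolveFieldReference and the sample records are hypothetical, and the real expression language (see Appendix A of [GENER]) is far richer than this single case.

```javascript
// Hypothetical resolver for references of the assumed form "$<index>.<FIELD>".
function resolveFieldReference(expression, recordsObj) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (match === null) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = recordsObj[parseInt(match[1], 10)];
    var fieldName = match[2];
    if (record === undefined || !(fieldName in record)) {
        throw new Error("Cannot resolve " + expression);
    }
    return record[fieldName];
}

// Two input records, as they could be passed to the constructor.
var recordsObj = [{ PARAM1: "laptop", PARAM2: 5 }, { TOTAL: 42 }];

var firstValue = resolveFieldReference("$0.PARAM1", recordsObj);
var secondValue = resolveFieldReference("$1.TOTAL", recordsObj);
```

In this sketch a reference that cannot be resolved throws, which corresponds to the ON_ERROR_RAISE action; an ON_ERROR_IGNORE policy would catch the error and continue instead.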

5.3.23 Record Sequence or Extractor Sequence

bull Object Record_Sequence

bull Description This creates a browsing sequence built from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component

bull Functions

o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

bull sequences ordered and sequential list of the NSEQL browsing sequences to be used by the component

bull sequenceDepends ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reuse Boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it

bull inputPage optional this allows for a homepage to be indicated

o exec() this returns a page object that represents the target page of the browsing sequences

o All of the methods offered by the Sequence component

5.3.24 Release Persistent Browser

bull Object Release_Persistent_Browser

bull Description accepts a browser id or a page as browser identifier and releases that specific browser

bull Functions

o Constructor(page)

bull page page loaded on the browser that is going to be released

o Constructor(browserUuid)

bull browserUuid browser identifier

o exec() executes the component

5.3.25 Repeat

bull Description This allows for loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT... UNTIL). The Repeat component is implemented in JavaScript using a do... while loop with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER]

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
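The difference with respect to the Loop component is that the operations run before the condition is first evaluated. The following plain JavaScript sketch follows the structure of Figure 7 literally (the loop continues while exec returns true). RepeatConditionStub is a hypothetical stub that wraps a callback; it is not the ITPilot Condition object.

```javascript
// Hypothetical stub: evaluates a callback instead of an ITPilot expression.
function RepeatConditionStub(test) {
    this.test = test;
}
RepeatConditionStub.prototype.exec = function (records) {
    return this.test(records);
};

// The body runs, then the condition decides whether to go round again.
var attempts = 0;
var repeat = new RepeatConditionStub(function () { return attempts < 3; });
do {
    attempts++; // loop operations go here
} while (repeat.exec([]));

// Even with a condition that is false from the start, the body runs once.
var runsOnce = 0;
var alwaysFalse = new RepeatConditionStub(function () { return false; });
do {
    runsOnce++;
} while (alwaysFalse.exec([]));
```

The second case is the one to keep in mind when choosing between the two components: a Repeat always performs its operations at least once.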

5.3.26 Script

bull Description This component allows part of the description logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When this component is used from the generation graphic interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow

5.3.27 Sequence

bull Object Sequence

bull Description This creates a browsing sequence in NSEQL language (see [NSEQL])

bull Functions

o Constructor(sequence, sequenceType, reusableConnection, inputPage)

bull sequence NSEQL browsing program (see [NSEQL])

bull sequenceType type of pool to use. The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reusableConnection this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information

bull inputPage optional parameter; this indicates the starting page. If it is not specified, the NSEQL program is run directly

o exec(inputValues, inputPage) this runs the Sequence component, returning the last page that the browsing sequence has reached

bull inputValues list of values that can be used as input parameters within the browsing sequence

bull inputPage optional parameter this describes the page from which the component browsing sequence is run

o setRetries(count) update function for the number of retries in the event of failures

bull count number of retries

o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated

bull mseconds this indicates the waiting time between retries in milliseconds

o close() this closes the connection with the running browser

o syncWithPost(flag) this function indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function

bull flag “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 58

o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out

bull back NSEQL back program

o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not

bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session

o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence

bull pages number of back pages

o toString() this returns the NSEQL (see [NSEQL]) sequence

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation bull 1 Internet Explorer browser implementation bull 2 Firefox browser implementation bull 3 Denodo HTTP browser implementation
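The order in which a Sequence decides how to get back to its source page (syncWithPost, then setBackSequence, then the NSEQL Back() command over the number of pages given to setBackPages) can be summarized as a small decision function. This is only a sketch of the documented order, not ITPilot code: chooseBackStrategy, the settings object and the returned labels are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper (not part of ITPilot): mirrors the documented order in
// which a Sequence chooses how to return to its source page.
function chooseBackStrategy(settings) {
    // 1. syncWithPost(true): re-issue the POST request that produced the page.
    if (settings.syncWithPost) {
        return "POST";
    }
    // 2. Otherwise use the explicit back sequence, if setBackSequence defined one.
    if (settings.backSequence) {
        return "BACK_SEQUENCE";
    }
    // 3. Otherwise run the NSEQL Back() command over the configured number of pages.
    return "NSEQL_BACK x " + settings.backPages;
}

var strategy = chooseBackStrategy({
    syncWithPost: false,
    backSequence: null,
    backPages: 2
});
// strategy is "NSEQL_BACK x 2"
```

The third branch is reached only when both earlier options are unavailable, which is the only case in which the value passed to setBackPages matters.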

5.3.28 Store File

• Object: StoreFile

• Description: Stores the contents received as the input parameter in a file.

• Functions:

  o Constructor(content, file)

    • content: string-type or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.

    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread

• Description: Represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): causes the thread to wait until all the executions launched with the execute function have finished.

  o execute(functionName, <list of arguments>): launches a thread that runs the given function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that run in parallel. Later requests are queued until the ongoing executions finish.

    • int: maximum number.
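The effect of setMaxConcurrentThreads on a batch of execute calls can be pictured with a small simulation. The sketch below is not ITPilot code: it assumes that all executions are requested at time 0, that queued requests are served in arrival order (the guide does not state the queueing order), and that maxConcurrent is at least 1. simulateFinishTimes and waitReturnsAt are hypothetical names.

```javascript
// Hypothetical simulation (not ITPilot code): given how long each requested
// execution takes, compute when each one finishes if at most maxConcurrent
// executions may run at the same time. FIFO queueing is an assumption.
function simulateFinishTimes(durations, maxConcurrent) {
    var slots = [];   // time at which each running slot becomes free
    var finish = [];  // finish time of each execution, in request order
    var i, j, earliest;
    for (i = 0; i < durations.length; i++) {
        if (slots.length < maxConcurrent) {
            // A slot is still free: the execution starts at time 0.
            slots.push(durations[i]);
            finish.push(durations[i]);
        } else {
            // All slots are busy: wait for the slot that frees up first.
            earliest = 0;
            for (j = 1; j < slots.length; j++) {
                if (slots[j] < slots[earliest]) {
                    earliest = j;
                }
            }
            slots[earliest] = slots[earliest] + durations[i];
            finish.push(slots[earliest]);
        }
    }
    return finish;
}

// Time at which a wait() issued after the last execute() would return.
function waitReturnsAt(finishTimes) {
    var latest = 0;
    for (var i = 0; i < finishTimes.length; i++) {
        if (finishTimes[i] > latest) {
            latest = finishTimes[i];
        }
    }
    return latest;
}

var finishTimes = simulateFinishTimes([3, 3, 3], 2);
// finishTimes is [3, 3, 6]: the third execution waits for a free slot.
```

With a limit of 2, the third request cannot start until one of the first two has finished, so a wait() call returns only at time 6.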

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix in the path <DENODO_HOME>/metadata/itp-custom-components that contains the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:

    • LIST_TYPE = 1

    • PAGE_TYPE = 2

    • RECORD_TYPE = 3

    • SIMPLE_TYPE = 4

    • ARRAY_TYPE = 5

    • BINARY_TYPE = 6

    • BOOLEAN_TYPE = 7

    • DATE_TYPE = 8

    • DOUBLE_TYPE = 9

    • FLOAT_TYPE = 10

    • INT_TYPE = 11

    • LONG_TYPE = 12

    • STRING_TYPE = 13

    • URL_TYPE = 14

    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function defines the output structure that the component returns. It is necessary only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
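The rule that an output structure function is only needed for record and list outputs can be written down as a short check. The sketch below is an illustration under stated assumptions: the type codes are copied from the list in this section and declared locally only so that the snippet runs on its own (inside ITPilot they are assumed to be predefined), the component name "greeting" is hypothetical, and needsOutputStructure is a hypothetical helper, not an ITPilot function.

```javascript
// Output type codes as listed in this section. Inside ITPilot these names are
// assumed to be predefined; they are declared here so the sketch is standalone.
var LIST_TYPE = 1;
var RECORD_TYPE = 3;
var STRING_TYPE = 13;

// Output type function of a hypothetical custom component called "greeting".
function greeting_getOutputType() {
    return STRING_TYPE;
}

// Hypothetical helper: tells whether <name>_getOutputStructure() is required
// for a component that declares the given output type.
function needsOutputStructure(outputType) {
    return outputType === RECORD_TYPE || outputType === LIST_TYPE;
}

var required = needsOutputStructure(greeting_getOutputType());
// required is false: a string output needs no output structure function.
```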

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> is the name of the component.

• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been written.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

4.5 EXAMPLE

Example of a function with annotations that returns an array: SPLIT, which splits strings around matches of a given regular expression and returns the array of these substrings.

    import com.denodo.common.custom.annotations.*;
    import com.denodo.common.custom.elements.*;
    import java.util.*;

    @CustomElement(type=CustomElementType.ITPFUNCTION, name="SPLIT_SAMPLE")
    public class Split {

        private static final String STRING_FIELD = "string";

        @CustomExecutor()
        public CustomArrayValue split_sample(
                @CustomParam(name="regexp") String regex,
                @CustomParam(name="valuer") String value) {
            if (value == null || regex == null) {
                return null;
            }
            String[] result = value.split(regex);
            LinkedHashMap<String, Object> results = new LinkedHashMap<String, Object>(1);
            List<CustomRecordValue> arrayValues = new ArrayList<CustomRecordValue>(result.length);
            for (String string : result) {
                results.put(STRING_FIELD, string);
                CustomRecordValue recordValue = CustomElementsUtil.createCustomRecordValue(results);
                arrayValues.add(recordValue);
            }
            return CustomElementsUtil.createCustomArrayValue(arrayValues);
        }

        @CustomExecutorReturnType
        public CustomArrayType split_sampleReturnType(String regex, String value) {
            LinkedHashMap<String, Object> props = new LinkedHashMap<String, Object>();
            props.put(STRING_FIELD, String.class);
            CustomRecordType record = CustomElementsUtil.createCustomRecordType(props);
            CustomArrayType array = CustomElementsUtil.createCustomArrayType(record);
            return array;
        }
    }

Figure 2: ITPilot Custom Function Sample

5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Denodo provides a graphical, component-based wrapper generation tool that enables the creation of wrapper programs to access semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for development. In addition, ITPilot allows users to write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume some basic knowledge of the JavaScript language.

Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how to interact in a wrapper with the predefined ITPilot components described in section 5.3, and how to develop complete JavaScript wrappers by following the indications shown in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

    function getInit() {
        var start = new Init();
        start.setText("INITPARAM", OBLIGATORY);
        return start;
    }

    function getOutputSchema() {
        var structureOutput = new Record_Structure("OUT_REC");
        structureOutput.setText("ATTRIBUTE_1");
        structureOutput.setText("ATTRIBUTE_2");
        structureOutput.setText("ATTRIBUTE_3");
        return structureOutput;
    }

    function main() {
    }

Figure 3: ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one. It contains the component implementation.

2. getInit() function: must be used to return the set of searchable parameters.

3. getOutputSchema() function: used to return the structure of the output objects, if they exist.¹

These functions mirror the component-based definition of the process: the input parameters correspond to those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

¹ Since version 4.0SP1 this function, previously known as getMetadata, has been renamed to getOutputSchema. The old name is still accepted for backwards compatibility, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

The getInit() function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described further on in section 5.3 (the component catalog).

5.2.2 Main Function

This is the place where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure

The getOutputSchema() function determines the wrapper’s output structure, if there is one. The structure is a data record implemented by the Record Structure object, which is defined in the catalog of section 5.3.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL). rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY).
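The rule about omitted parameters follows from the fact that JavaScript arguments are positional. The sketch below illustrates it with a stand-in function, assuming that the argument sitting between the field name and the type is the regular expression, whose default is the empty string. describeTextField is hypothetical and is not the ITPilot setText function, and the string values given to OPTIONAL and OBLIGATORY are placeholders for constants that ITPilot is assumed to predefine.

```javascript
// Placeholder values: the real OPTIONAL and OBLIGATORY constants are assumed
// to be predefined by ITPilot and their actual values are not known here.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Hypothetical stand-in that mimics the documented defaults of
// setText(field, regexp, type). It only reports how the arguments are read.
function describeTextField(field, regexp, type) {
    // Arguments are positional: a missing trailing argument is undefined.
    if (regexp === undefined) {
        regexp = "";
    }
    if (type === undefined) {
        type = OPTIONAL;
    }
    return field + "|" + regexp + "|" + type;
}

var shortForm = describeTextField("FIELD");                  // both defaults applied
var longForm = describeTextField("FIELD", "", OPTIONAL);     // same call, spelled out
var mandatory = describeTextField("FIELD", "", OBLIGATORY);  // regexp given so type can follow
var misplaced = describeTextField("FIELD", OBLIGATORY);      // OBLIGATORY lands in the regexp slot
```

In the last call the value meant as the type is read as the regular expression, and the field silently stays optional, which is why the middle argument has to be written out whenever an argument to its right is needed.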

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.

5.3.2.1 Record Structure

• Object: Record_Structure

• Description: Represents a data structure that allows the structure of a specific record to be defined. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).

• Functions:

  o Constructor(name)

    • name: name of the structure.

  o setText(field, regexp, type): creates a new character string field in the record.

    • field: name of the new field.

    • regexp (optional): regular expression that constrains how the character string is generated. By default, if no constraint exists, its value is “”.

    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setLink(field, type): creates a new Link-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setInt(field, type): creates a new Integer-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setBoolean(field, type): creates a new boolean-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setLong(field, type): creates a new Long-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setFloat(field, type): creates a new Float-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setDouble(field, type): creates a new Double-type field in the record.

    • field: name of the new field.

    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.

    • field: name of the new field.

    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setDate(field, regexp, format, type): creates a new Date-type field in the record.

    • field: name of the new field.

    • regexp (optional): regular expression that constrains how the character string is generated. By default, if no constraint exists, its value is “”.

    • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.

    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setRegister(record, type): creates a new Record-type field in the record.

    • record: record name.

    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o setArray(name, structure, type): creates a new Array-type field in the record.

    • name: name of the array.

    • structure: data structure that represents the structure of the records contained in the array.

    • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.

  o toString(): transforms the record into a character string that represents it.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.

NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List

• Functions:

  o setListName(listName): sets the name of the list.

    • listName: name of the list.

  o add(obj): adds an element to the list.

    • obj: element to add.

  o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these “common” functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId values.

  o errorId: indicates the type of error whose handling is being configured. The possible values are:

    • RUNTIME_ERROR: error while the component is being run.

    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.

    • HTTP_ERROR: error produced by an HTTP error.

    • TIMEOUT_ERROR: error raised if the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.

    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).

  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:

    • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.

    • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run. In general, components that have a return value will return “null” when there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, the value is evaluated as “false” if it is used in a condition expression.

    • ON_ERROR_RETRY: reruns the failed execution. The number of retries and the time between retries are configured on each component.

    • ON_ERROR_RETRY_IGNORE: retries as with ON_ERROR_RETRY, but continues with the wrapper execution if the error still occurs after the retries.
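The four error actions can be compared side by side with a small simulation. The sketch below is not ITPilot code: runWithPolicy is a hypothetical helper, the string values of the constants are placeholders for constants that ITPilot is assumed to predefine, and the delay between retries is left out so that the example stays synchronous. It also assumes that ON_ERROR_RETRY raises the error once the retries are exhausted, which the guide implies by contrast with ON_ERROR_RETRY_IGNORE but does not state.

```javascript
// Placeholder values for the four error actions (assumed predefined in ITPilot).
var ON_ERROR_RAISE = "RAISE";
var ON_ERROR_IGNORE = "IGNORE";
var ON_ERROR_RETRY = "RETRY";
var ON_ERROR_RETRY_IGNORE = "RETRY_IGNORE";

// Hypothetical helper: runs step() under the given action. "retries" plays the
// role of the value a component would receive through setRetries(count).
function runWithPolicy(step, action, retries) {
    var retrying = (action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE);
    var attemptsLeft = retrying ? retries : 0;
    while (true) {
        try {
            return step();
        } catch (error) {
            if (attemptsLeft > 0) {
                attemptsLeft--;   // try the same step again
                continue;
            }
            if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
                return null;      // carry on; the component yields null
            }
            throw error;          // RAISE, or RETRY with no attempts left
        }
    }
}

// A step that fails on its first two calls and succeeds on the third.
var calls = 0;
function flakyStep() {
    calls++;
    if (calls <= 2) {
        throw new Error("source not available");
    }
    return "page";
}

var result = runWithPolicy(flakyStep, ON_ERROR_RETRY, 3);
// result is "page" after three calls to flakyStep.
```

Under ON_ERROR_IGNORE the same step would yield null after a single call, which is the null value that later components have to be prepared to receive.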

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log and 5 means that all message types are written to the log file. The log message types are the following:

  o TRACE

  o DEBUG

  o INFO

  o WARN

  o ERROR

  o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List

• Description: Adds a record to a list.

• Functions:

  o Constructor()

  o exec(record, list): executes the function.

    • record: record to be added to the list.

    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition

• Description: Allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.

• Functions:

  o Constructor(expr)

    • expr: defines the condition expression. It is expressed as a character string. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(elements): main function of the Condition component. It evaluates the condition and returns “true” or “false” depending on whether the condition described in the constructor is met when applied to the input elements.

    • elements: the elements on which the condition is evaluated. This parameter must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”.
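The way $0, $1 and so on refer to positions in the list passed to exec can be shown with a toy evaluator. This is only a sketch of positional substitution, not the way ITPilot evaluates conditions: evalCondition is a hypothetical function, it handles numeric elements only, and it knows nothing about the function set defined in [GENER].

```javascript
// Hypothetical evaluator (not ITPilot's): replaces $0, $1, ... with the
// corresponding entries of "elements" and evaluates the resulting expression.
// Only numeric elements are supported in this sketch.
function evalCondition(expr, elements) {
    var resolved = expr.replace(/\$(\d+)/g, function (match, index) {
        // $n refers to the element at position n of the list.
        return String(Number(elements[parseInt(index, 10)]));
    });
    return eval(resolved) ? true : false;
}

var met = evalCondition("($0 <= $1)", [3, 5]);     // evaluates (3 <= 5)
var notMet = evalCondition("($0 <= $1)", [7, 5]);  // evaluates (7 <= 5)
```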

5.3.6 Create List

• Object: Create_List

• Description: Creates an empty list.

• Functions:

  o Constructor(listname): creates an empty list.

    • listname: name of the list of records to be created.

  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser

• Description: Creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.

• Functions:

  o Constructor(): creates a persistent browser and returns its handler.

  o exec(): executes the component.

5.3.8 Diff

• Object: Diff

• Description: The Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.

• Functions:

  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)

    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).

    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).

    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).

    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red background HTML end tag).

    • tokenSeparator: character string used as the separator between HTML page elements when the result page is generated, so that each of them can be adequately identified.

  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.

    • baseCode: character string with the source page content.

    • finalCode: character string or page object with the target page content.

  o exec(baseCode, finalCode): executes the Diff component and returns a character string with the HTML content of the pages, in which the differences between them are marked.

    • baseCode: character string with the source page content.

    • finalCode: character string or page object with the target page content.

  o setAdditionPrefixLabel(additionPrefixLabel): changes the start tag used for added content.

    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green background HTML tag).

  o setAdditionSuffixLabel(additionSuffixLabel): changes the end tag used for added content.

    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, a green background HTML end tag).

  o setDeletionPrefixLabel(deletionPrefixLabel): changes the start tag used for deleted content.

    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red background HTML tag).

  o setDeletionSuffixLabel(deletionSuffixLabel): changes the end tag used for deleted content.

    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red background HTML end tag).

  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.

    • nullWhenEquals: “true” means that “null” is returned when both pages are equal. “false” means that the result page is returned.

  o setIgnoreTagAttributes(simplifyTags): makes the component disregard HTML tag attributes when comparing both pages.

    • simplifyTags: “true” means that HTML tag attributes are ignored. With “false” they are not ignored.

  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.

    • toLowerCase: “true” transforms all HTML content to lower case. “false” keeps the content as is.

  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.

    • mergedDeletions: with “true”, the deleted content is shown. If the value is “false”, the configuration made with the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.

  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.

    • replacement: Perl [PERL] regular expression.

  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page. Tokens that match the regular expression are discarded before the comparison starts.

    • regexp: Perl [PERL] regular expression.
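The role of the prefix and suffix labels and of the token separator can be seen in a toy token-level diff. The sketch below is not Denodo's algorithm: diffTokens is a hypothetical function that works on pre-split tokens, uses a standard longest-common-subsequence table, and takes the labels in a hypothetical object whose field names were chosen for this example.

```javascript
// Hypothetical token-level diff (not Denodo's algorithm): tokens found only in
// finalTokens are wrapped as additions, tokens found only in baseTokens are
// wrapped as deletions, and unchanged tokens are copied as they are.
function diffTokens(baseTokens, finalTokens, labels) {
    var n = baseTokens.length, m = finalTokens.length;
    var lcs = [], out = [], i, j;
    // lcs[i][j] = length of the longest common subsequence of
    // baseTokens[i..] and finalTokens[j..]
    for (i = n; i >= 0; i--) {
        lcs[i] = [];
        for (j = m; j >= 0; j--) {
            if (i === n || j === m) {
                lcs[i][j] = 0;
            } else if (baseTokens[i] === finalTokens[j]) {
                lcs[i][j] = lcs[i + 1][j + 1] + 1;
            } else {
                lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }
    }
    // Walk both token lists, emitting common, added and deleted tokens.
    i = 0;
    j = 0;
    while (i < n || j < m) {
        if (i < n && j < m && baseTokens[i] === finalTokens[j]) {
            out.push(baseTokens[i]);
            i++;
            j++;
        } else if (j < m && (i === n || lcs[i][j + 1] >= lcs[i + 1][j])) {
            out.push(labels.addPrefix + finalTokens[j] + labels.addSuffix);
            j++;
        } else {
            out.push(labels.delPrefix + baseTokens[i] + labels.delSuffix);
            i++;
        }
    }
    return out.join(labels.separator);
}

var marked = diffTokens(["a", "b", "c"], ["a", "c", "d"], {
    addPrefix: "[+", addSuffix: "+]",
    delPrefix: "[-", delSuffix: "-]",
    separator: " "
});
// marked is "a [-b-] c [+d+]"
```

In the component itself the labels default to HTML tags with a coloured background, so the same idea yields a page in which additions and deletions are highlighted.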

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence component (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequenceExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: Allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.

• Functions:

  o Constructor(expression)

    • expression: object that defines the expression. It is expressed as a character string, in the same way as a condition expression. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(exprInput): runs the component and returns the value that results from the expression indicated in the component constructor.

    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor

• Description: responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).

• Functions:

o Constructor(name, page, specification, structure)

• name: name of the Extractor component instance.

• page: page-type ITPilot structure from which data is to be extracted.

• specification: DEXTL data extraction specification (see [DEXTL]).

• structure: name of the (previously created) record that will be used to return the data extracted by the specification.

o exec(): main extractor method, which runs the specification indicated in the constructor. It returns a list of records of the type defined in the structure parameter of the constructor.

o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).

• merge: Boolean parameter; "true" if the pattern-merging technique is to be applied, "false" if not. The default is "true".

o setI18n(i18n): updates the internationalization configuration of the process.

• i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

o Constructor(url, sequenceType, reusableConnection, binary, page)

• url: URL where the resource to be downloaded can be found (OPTIONAL).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

• binary: "true" if the object to be downloaded is binary; "false" if it is in text format.

• page: optionally, the page from which the HTTP request is launched.

o exec(page): runs the component, returning the string- or binary-type value obtained.

• page: optionally, the page from which the HTTP request is launched.

o setEncoding(encoding): lets the user set the MIME type [MIME] of the information to send.

• encoding: MIME type of the information to send.

o syncWithPost(flag): sets the method for recovering the page state. With this method, ITPilot sends a POST message to the page URL with the POST parameters that were initially used to access that page. This is the default synchronization method.

• flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.

o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which further information extraction operations are going to be executed.

• back: back sequence NSEQL program.

o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

• reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing the information from the previous session.

o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.

o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation

• 1: Internet Explorer browser implementation

• 2: Firefox browser implementation

• 3: Denodo HTTP browser implementation
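
The order in which the page state is recovered, as described for syncWithPost, setBackSequence and setBackPages, can be sketched in plain JavaScript. This is only an illustration: chooseBackStrategy, its config object and the strategy names are hypothetical and not part of the ITPilot API, and the default of one page for Back() is an assumption.

```javascript
// Hypothetical helper, not part of the ITPilot API: it only mirrors the
// documented order in which the page state is recovered.
function chooseBackStrategy(config) {
    // syncWithPost is documented as the default synchronization method.
    if (config.syncWithPost !== false) {
        return { strategy: "POST" };
    }
    // Next comes an explicit back sequence (setBackSequence), if one was set.
    if (config.backSequence) {
        return { strategy: "BACK_SEQUENCE", sequence: config.backSequence };
    }
    // Otherwise the NSEQL Back() command is used.
    // Defaulting to 1 page is an assumption made for this sketch.
    var pages = config.backPages === undefined ? 1 : config.backPages;
    return { strategy: "BACK_COMMAND", pages: pages };
}

console.log(chooseBackStrategy({}).strategy);
console.log(chooseBackStrategy({ syncWithPost: false, backPages: 2 }).strategy);
```

Keeping the decision in one place makes it easy to see that setBackPages only matters once both the POST synchronization and the back sequence have been ruled out.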

5.3.13 Filter

• Object: Filter

• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

o Constructor(expr, auxiliaryRecords)

• expr: regular expression of the filtering operation for a list of records, which are described in the exec function.

• auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.

• inputRecords: list of input records.

• auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
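
The behaviour described in the note can be illustrated with a small stand-in written in plain JavaScript. It is a sketch only: filterRecords, cheaperThanTen and the PRICE field are hypothetical names, the constants are local strings rather than the ITPilot ones, and a JavaScript predicate replaces the component's expression.

```javascript
// Hypothetical stand-in for the behaviour described in the note; this is not
// the ITPilot Filter object, and the constants are local to the sketch.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function filterRecords(inputRecords, predicate, errorAction) {
    var kept = [];
    inputRecords.forEach(function (record) {
        try {
            if (predicate(record)) {
                kept.push(record);
            }
        } catch (e) {
            if (errorAction !== ON_ERROR_IGNORE) {
                throw e; // ON_ERROR_RAISE: stop and report the error
            }
            // ON_ERROR_IGNORE: drop only the record that caused the error
        }
    });
    return kept;
}

// Example predicate: fails on records whose PRICE is not a number.
function cheaperThanTen(record) {
    if (typeof record.PRICE !== "number") {
        throw new Error("PRICE is not numeric");
    }
    return record.PRICE < 10;
}

var records = [{ PRICE: 5 }, { PRICE: "n/a" }, { PRICE: 20 }, { PRICE: 7 }];
var cheap = filterRecords(records, cheaperThanTen, ON_ERROR_IGNORE);
console.log(cheap.length);
```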

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: generates an execution loop over a specific form, in which predetermined values for each of the included fields are used in each run.

• Functions:

o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

• findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

• submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

• baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences used in this component.

• inputPage: input page from which the selected form can be iteratively invoked.

• parallelIterator: if "true", the component executes its iterations in parallel.

o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting at position 0.

• positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

• clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field of the chosen form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting at position 0.

• valuesArray: list of values that must be selected in the field.

• positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

• equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or only contained in it (equals = false).

• clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectPositions(field, position, positions): indicates the values selected from a selection field of the chosen form.

• field: name of the HTML selection field.

• position: position occupied by the field when more than one field has the same name.

• positions: values of the elements on which the component must iterate.

o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.

• field: name of the HTML text field.

• position: position of the field when several fields on the form have the same name.

• values: list of values that must be selected in the field.

• positions: list that indicates the position held by each element of values in the event of replicated values.

• equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.

o click(field, value, state): selects an element and runs a "click" event on it.

• field: name of the HTML field on which the click is to be made.

• value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.

• state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o input(field, position, values): indicates the values added to an input field.

• field: name of the HTML input field.

• position: position of the field when several fields on the form have the same name.

• values: list of values that must be selected in the field.

o textarea(field, position, values): indicates the values added to a text area.

• field: name of the HTML input field.

• position: position of the field when several fields on the form have the same name.

• values: list of values that must be selected in the field.

o toList(): returns the list of NSEQL sequences used in each iteration.

o setMaxIterations(count): sets the maximum number of iterations that can be executed.

• count: maximum number of iterations.

o setRetries(count): sets the number of retries in the event of failure.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.

o setParallelIterator(flag): makes the component launch its iterations in parallel.

• flag: if "true", the iterations are executed in parallel.

o next(inputPage): returns the page resulting from running one iteration of the component.

• inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.

o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.

o close(): closes the iterator.

o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.

• flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

• reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.

o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.

o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation

• 1: Internet Explorer browser implementation

• 2: Firefox browser implementation

• 3: Denodo HTTP browser implementation
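
The effect of the clickedArray constants on the number of generated combinations can be sketched as follows. This is an illustration in plain JavaScript, not ITPilot code: expandClickStates is a hypothetical helper, the constant values are placeholders local to the sketch, and combining the per-element options as a Cartesian product is an assumption about how the iterator enumerates its runs.

```javascript
// Placeholder values: the real ITPilot constants are not defined here.
var CLICKED_ELEMENT = "clicked";
var NON_CLICKED_ELEMENT = "non_clicked";
var CLICKED_AND_NON_CLICKED_ELEMENT = "both";

// Hypothetical helper: expands a clickedArray into every marked/unmarked
// combination (true = marked, false = unmarked).
function expandClickStates(clickedArray) {
    var combos = [[]];
    clickedArray.forEach(function (state) {
        var options;
        if (state === CLICKED_ELEMENT) {
            options = [true];
        } else if (state === NON_CLICKED_ELEMENT) {
            options = [false];
        } else {
            options = [true, false]; // two combinations for this element
        }
        var next = [];
        combos.forEach(function (combo) {
            options.forEach(function (option) {
                next.push(combo.concat([option]));
            });
        });
        combos = next;
    });
    return combos;
}

var combos = expandClickStates([
    CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT,
    NON_CLICKED_ELEMENT
]);
console.log(JSON.stringify(combos));
```

Under this assumption, each element set to CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of form submissions, which is worth keeping in mind together with setMaxIterations.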

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.

• Functions:

o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.

• browserUuid: browser id.

o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, for later browsing by using a Sequence object (see section 5.3.27).

• pageType: type of browser used to access the page:

• SEQUENCE_IEBROWSER = 1

• SEQUENCE_HTTP_BROWSER = 2

• lastURL: last URL the page is coming from.

• lastURLMethod: access method (GET, POST) of the URL the page is coming from.

• lastURLPostParameters: POST-method parameters of the URL the page is coming from.

• cookie: information storage "cookies".

• proxyUser: user name to access the proxy, if required.

• proxyPassword: user password to access the proxy, if required.

• proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: responsible for storing the structure of the input data, that is, the data that the wrapper will receive from the calling application.

• Functions:

o Constructor(input, output)

• input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

• output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (by the exec() function).

o get(name): returns the value of a field of the record created as the group of initialization parameters.

• name: name of the record field.

o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the created field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the created field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the created field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the created field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the created field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the created field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the created field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the created field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

• field: name of the field to create.

• format: representation format of the date field. This format is optional, but becomes compulsory if completed; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].

• obl: parameter that indicates whether a value for the created field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setName(name): updates the component name.

• name: new component name.

o setI18n(i18n): updates the internationalization (i18n) configuration of the process.

• i18n: type of internationalization to use. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

o exec(): main function for running the component; it returns a record representing the wrapper initialization parameters.
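
The meaning of the three obl values can be sketched with a small stand-in in plain JavaScript. It is not the Init object: resolveParameters, the field descriptors and the sample field names are hypothetical, the constants are local strings, and letting a FIXED value override a value supplied in the query is an assumption of this sketch.

```javascript
// Local placeholders for the documented obl values.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical helper: builds the initialization record for one query.
function resolveParameters(fields, query) {
    var record = {};
    fields.forEach(function (field) {
        var supplied = Object.prototype.hasOwnProperty.call(query, field.name);
        if (field.obl === FIXED) {
            // Constant value; overriding the query value is an assumption.
            record[field.name] = field.fixedValue;
        } else if (supplied) {
            record[field.name] = query[field.name];
        } else if (field.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + field.name);
        }
        // OPTIONAL and not supplied: the field is simply left unset.
    });
    return record;
}

var fields = [
    { name: "TITLE", obl: OBLIGATORY },
    { name: "AUTHOR", obl: OPTIONAL },
    { name: "LANG", obl: FIXED, fixedValue: "en" }
];
var record = resolveParameters(fields, { TITLE: "Dune", LANG: "es" });
console.log(JSON.stringify(record));
```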

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records, one by one.

• Functions:

o Constructor(list)

• list: list of records on which to iterate.

o hasNext(): determines whether there are more results on which to iterate. It returns "true" if there is at least one more result.

o next(): returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}

Figure 5 Using threads in the Iterator component

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.

• Functions:

o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

• uuid: unique identifier of the component.

• uri: connection URL to the database.

• driver: driver class used to connect to the data source.

• userName: user name.

• password: user password.

• structure: structure of the component's output record list. It is defined as a record of values.

• baseRecords: record list to be used.

• maxPoolSize: maximum number of connections that can be managed by the pool at the same time.

• initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.

• checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

• query: SQL query that returns the results required by the component.

o exec(query, baseRecords): executes the JDBCExtractor component.

• query: SQL query that returns the results required by the component.

• baseRecords: record list to be used.

o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

• maxPoolSize: maximum number of connections that can be managed by the pool at the same time.

• initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.

• pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

o disablePool(): disables the connection pool.

o addDriverProperty(propname, propvalue): adds a JDBC driver property.

• propname: property name.

• propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}

Figure 6 Using the Loop function
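
The WHILE… DO semantics can be tried outside ITPilot with a stand-in for the Condition object. In this sketch, makeCondition is a hypothetical helper (the real Condition takes an expression string, not a JavaScript function), and the page counter is invented for the example.

```javascript
// Hypothetical stand-in for a Condition object: exec() evaluates a predicate
// over the list of inputs it receives.
function makeCondition(predicate) {
    return {
        exec: function (inputs) {
            return predicate.apply(null, inputs);
        }
    };
}

var page = 1;
var visited = [];
var loop = makeCondition(function () { return page <= 3; });

// WHILE ... DO: the condition is checked before every pass,
// so the body may run zero times.
while (loop.exec([])) {
    visited.push(page);
    page = page + 1;
}
console.log(visited.join(","));
```

Because the condition is evaluated before the body, a condition that is false from the start skips the loop operations entirely.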

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: allows iteration over different inter-related pages by means of one or several browsing sequences.

• Functions:

o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

• sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.

• iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched, maintaining the session's information.

• inputPage: indicates the page from which the next browsing sequence is to be run.

o next(inputRecords, inputPage): returns the next iteration element.

• inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.

• inputPage: indicates the page from which the next pages are to be accessed.

o close(): closes the iterator.

o setRetries(count): configures the number of retries in the event of an error when accessing the next page.

• count: number of retries.

o setRetryDelay(count): configures the interval between two retries.

• count: interval in milliseconds.

o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.

• flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

• reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.

o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.

o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation

• 1: Internet Explorer browser implementation

• 2: Firefox browser implementation

• 3: HTTP browser implementation
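
The rule given for the sequences parameter (a single sequence is used in all iterations, several sequences are used one per iteration) can be sketched as follows. sequenceForIteration is a hypothetical helper, the sequence strings are placeholders rather than real NSEQL, and returning null once the list is exhausted is an assumption of this sketch.

```javascript
// Hypothetical helper, not part of the ITPilot API: picks the browsing
// sequence for a given zero-based iteration number.
function sequenceForIteration(sequences, iteration) {
    if (sequences.length === 1) {
        // A single sequence is tried in every iteration.
        return sequences[0];
    }
    // Several sequences: one per iteration; null when none is left (assumption).
    return iteration < sequences.length ? sequences[iteration] : null;
}

var single = ["<NSEQL next-page sequence>"];
var several = ["<NSEQL sequence A>", "<NSEQL sequence B>"];

console.log(sequenceForIteration(single, 5));
console.log(sequenceForIteration(several, 1));
console.log(sequenceForIteration(several, 2));
```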

5.3.21 Output

• Object: Output

• Description: places a record in the wrapper output.

• Functions:

o Constructor(structure)

• structure: parameter that indicates the component input record to be used as the wrapper result.

o add(record): subsequently adds the component input record that is to be used as the wrapper result.

• record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from already existing ones.

• Functions:

o Constructor(recordsObj, name)

• recordsObj: list of input elements. Each element of the list can be a record or a list of records.

• name: name of the output record of the Record Constructor component.

o add(fieldName, expression, errorAction): adds a new field to the record under construction.

• fieldName: name of the field.

• expression: field definition expression; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

• errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:

• ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

• ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
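
How a positional reference selects a field of one of the input records can be illustrated with the following sketch. It assumes the reference has the form "$<position>.<FIELD>" and handles only that simple case; resolveFieldReference and the sample records are hypothetical, and the real component evaluates full expressions built with the functions from [GENER].

```javascript
// Hypothetical helper: resolves a reference such as "$0.PARAM1" against the
// list of input records. Only this simple form is supported in the sketch.
function resolveFieldReference(expression, recordsObj) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = recordsObj[Number(match[1])];
    if (record === undefined) {
        throw new Error("No input record at position " + match[1]);
    }
    return record[match[2]];
}

var inputs = [
    { PARAM1: "first", PARAM2: 10 },
    { PARAM1: "second" }
];
console.log(resolveFieldReference("$0.PARAM1", inputs));
console.log(resolveFieldReference("$1.PARAM1", inputs));
```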

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although this may not be a good option if the previous iterator runs in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT … UNTIL). The Repeat component is implemented in JavaScript using a do … while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));

Figure 7: Using the Repeat function
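The control flow of Figure 7 can be tried outside ITPilot with a small stand-in. This is only a sketch: FakeCondition is a hypothetical replacement for the Condition object that takes a plain JavaScript predicate instead of an ITPilot expression, and MAX_PAGES and pagesVisited are invented for the illustration. As in the figure, the loop body runs once before the condition is first evaluated, and the loop continues for as long as exec() returns true.

```javascript
// Hypothetical stand-in for the ITPilot Condition object, so that the sketch
// runs outside ITPilot. The "expression" here is a plain JavaScript predicate.
function FakeCondition(predicate) {
    this.predicate = predicate;
}

// Applies the predicate to the list of elements, like exec(elements).
FakeCondition.prototype.exec = function (elements) {
    return this.predicate.apply(null, elements) ? true : false;
};

var MAX_PAGES = 3;      // assumed limit, for illustration only
var pagesVisited = 0;

var repeat = new FakeCondition(function (visited, limit) {
    return visited < limit;
});

do {
    // <loop_operations> would go here; the sketch only counts iterations.
    pagesVisited = pagesVisited + 1;
} while (repeat.exec([pagesVisited, MAX_PAGES]));
```

With these values the body runs three times: the condition is checked after each pass and fails once pagesVisited reaches MAX_PAGES.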

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked at the point it occupies in the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to restore the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation

5.3.28 Store File

• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to wait until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is necessary only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
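As an illustration of how these functions fit together, the following sketch outlines a custom component file for a component named “mycustom” that returns a string. It is not taken from the ITPilot distribution. FakeRecordStructure is a hypothetical stand-in for Record_Structure, the TITLE field is invented, and treating the input record as a plain object with a TITLE property is an assumption made only so the example can run on its own.

```javascript
// Type code as listed above; redefined here only so the sketch runs on its
// own (inside ITPilot it is assumed to be available already).
var STRING_TYPE = 13;

// Hypothetical stand-in for Record_Structure, not the ITPilot object.
function FakeRecordStructure(name) {
    this.name = name;
    this.fields = [];
}
FakeRecordStructure.prototype.setText = function (field) {
    this.fields.push(field);
};

// Input schema: a single text field named TITLE (an invented example field).
function mycustom_getInputStructure() {
    var input = new FakeRecordStructure("MYCUSTOM_INPUT");
    input.setText("TITLE");
    return input;
}

// The component returns a simple string, so mycustom_getOutputStructure()
// is not defined (it is only needed for RECORD_TYPE or LIST_TYPE).
function mycustom_getOutputType() {
    return STRING_TYPE;
}

// Main function. Reading the input as a plain object with a TITLE property
// is an assumption about how the record is exposed to the script.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    if (mycustom_input && mycustom_input.TITLE) {
        mycustom_output = String(mycustom_input.TITLE).toUpperCase();
    }
    return mycustom_output;
}
```

All function names share the component name as prefix, which is what ties the file to the component.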

5.4.2 Using Custom Components

To use a custom component developed in JavaScript, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just written.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside it must be escaped with the ‘\’ character.
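The escaping step can be automated when the statement is built by a program. The sketch below rests on two assumptions that should be checked against [VQL]: that the delimiter is the double quote and that the escape character is the backslash. buildCreateWrapper is a hypothetical helper, not part of ITPilot or Virtual DataPort.

```javascript
// Hypothetical helper: escapes the double quotes found in the script and
// wraps the result in a CREATE WRAPPER ITP statement.
// Assumptions: double-quote delimiter, backslash as the escape character.
function buildCreateWrapper(name, jscode) {
    var escaped = jscode.replace(/"/g, '\\"');
    return "CREATE WRAPPER ITP " + name + ' "' + escaped + '"';
}

var statement = buildCreateWrapper(
    "mywrapper",
    'function main() { var s = "x"; }'
);
```

Keeping the escaping in one function avoids editing the script by hand each time the wrapper is recreated.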

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl


5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT

5.1 INTRODUCTION

Although Denodo provides a graphical, component-based wrapper generation tool that creates wrapper programs for semi-structured sources (web, Adobe PDF or Microsoft Word) with no need for programming, ITPilot also lets users write their own wrappers entirely in the JavaScript programming language. The JavaScript version supported by Denodo ITPilot is 1.5, which is compliant with the ECMA 3.0 standard [ECMA262]. The following sections assume basic knowledge of the JavaScript language. Section 5.2 introduces the JavaScript representation format of ITPilot wrappers. This makes it possible to understand how a wrapper interacts with the predefined ITPilot components (section 5.3) and how to develop complete JavaScript wrappers by following the indications in section 5.4.1.

5.2 REPRESENTATION FORMAT OF A WRAPPER

An ITPilot wrapper is structured in JavaScript as shown in Figure 3.

function getInit() {
    var start = new Init();
    start.setText("INITPARAM", OBLIGATORY);
    return start;
}

function getOutputSchema() {
    var structureOutput = new Record_Structure("OUT_REC");
    structureOutput.setText("ATTRIBUTE_1");
    structureOutput.setText("ATTRIBUTE_2");
    structureOutput.setText("ATTRIBUTE_3");
    return structureOutput;
}

function main() {
}

Figure 3: ITPilot Wrapper Skeleton in JavaScript

There are three possible functions in each script, one mandatory and two optional:

1. main() function: the only mandatory one; it contains the component implementation.
2. getInit() function: must be used to return the set of searchable parameters.
3. getOutputSchema() function: used to return the structure of the output objects, if they exist (see footnote 1).

These functions correspond to components of the process definition: the input parameters are those defined in the Initialization component, and the output record is defined just as it is received by the Output component.

Footnote 1: Since version 4.0SP1, this function, previously known as getMetadata, has been renamed to getOutputSchema. Backwards compatibility is maintained, but the use of the new name is strongly recommended.

5.2.1 Initialization of Searchable Parameters

The getInit() function describes the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), creates a new parameter initialization object. This object is described in section 5.3 (the component catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure

The getOutputSchema() function determines the wrapper’s output structure, if there is one. The structure is a data record implemented by the Record_Structure object, which is defined in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter provides the list of predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters of the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1):

  rs.setText("FIELD") is equivalent to rs.setText("FIELD", "", OPTIONAL)
  rs.setText("FIELD", OBLIGATORY) is not valid. The following must be used: rs.setText("FIELD", "", OBLIGATORY)
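The positional rule in the note above can be illustrated with plain JavaScript. The sketch below is not ITPilot code: FakeRecordStructure is a hypothetical stand-in whose setText(field, regexp, type) mimics the documented defaults (empty regular expression, optional field), and OPTIONAL and OBLIGATORY are given arbitrary string values only so that the example runs on its own. In this stand-in a misplaced argument simply binds to the wrong parameter; the real component may reject the call instead.

```javascript
// Hypothetical stand-ins, defined only so this sketch runs outside ITPilot.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

function FakeRecordStructure(name) {
    this.name = name;
    this.fields = [];
}

// Mimics the documented signature setText(field, regexp, type):
// arguments omitted on the right fall back to their defaults.
FakeRecordStructure.prototype.setText = function (field, regexp, type) {
    this.fields.push({
        field: field,
        regexp: regexp === undefined ? "" : regexp,
        type: type === undefined ? OPTIONAL : type
    });
};

var rs = new FakeRecordStructure("OUT_REC");
rs.setText("A");                   // regexp "" and type OPTIONAL by default
rs.setText("B", "", OPTIONAL);     // same result, spelled out
rs.setText("C", "", OBLIGATORY);   // regexp must be given to reach type
rs.setText("D", OBLIGATORY);       // pitfall: OBLIGATORY lands in regexp
```

Field D ends up optional with OBLIGATORY as its regular expression, which is why the middle argument has to be written explicitly whenever a later one is needed.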

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections define them.

5.3.2.1 Record Structure

• Object: Record_Structure
• Description: represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.
    • field: name of the new field.
    • regexp: (optional) regular expression of the character string generation. By default, if no constraint exists, its value is “”.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setBoolean(field, type): creates a new Boolean-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp: (optional) regular expression of the character string generation. By default, if no constraint exists, its value is “”.
    • format: (optional) date format, following [DATEFORMAT]. By default its value is d-MMM-yyyy H'h' m'm' s's'.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type: (optional) defines whether the parameter is mandatory or not. By default the field is optional.
  o toString(): transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR component (see section 5.3.22) must be used, except for Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components lack some of these “common” functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId parameter values.
  o errorId: indicates the type of error for which the behavior is to be managed. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised when the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value will return “null” if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, it is evaluated as “false” if they are used in a condition expression.
    • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configured in each parameter.
    • ON_ERROR_RETRY_IGNORE: rerun the wrapper as with ON_ERROR_RETRY, but continue with the wrapper execution if the error still occurs after the retries.
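The per-error dispatch described above can be pictured with a short stand-in. This is a sketch under stated assumptions, not the ITPilot implementation: FakeComponent is hypothetical, the constants are given arbitrary string values, only ON_ERROR_RAISE and ON_ERROR_IGNORE are modeled, and treating an unregistered error type as a raise is an assumption.

```javascript
// Hypothetical stand-ins so that the sketch runs outside ITPilot.
var RUNTIME_ERROR = "RUNTIME_ERROR";
var CONNECTION_ERROR = "CONNECTION_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function FakeComponent(work) {
    this.work = work;      // function that may throw { errorId: ... }
    this.actions = {};     // errorId -> errorAction
}

// onError can be called several times, once per errorId.
FakeComponent.prototype.onError = function (errorId, errorAction) {
    this.actions[errorId] = errorAction;
};

FakeComponent.prototype.exec = function () {
    try {
        return this.work();
    } catch (e) {
        if (this.actions[e.errorId] === ON_ERROR_IGNORE) {
            return null;   // ignored errors yield null and the run continues
        }
        throw e;           // ON_ERROR_RAISE (assumed default): stop the run
    }
};

var flaky = new FakeComponent(function () {
    throw { errorId: CONNECTION_ERROR };
});
flaky.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
flaky.onError(CONNECTION_ERROR, ON_ERROR_IGNORE);
var result = flaky.exec(); // the connection error is ignored
```

Because the ignored case returns null, code that uses the result should check for null before reading fields from it.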

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message is written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression. It is expressed as a string of characters; e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.

5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 28

538 Diff

bull Object Diff

bull Description the Diff component allows comparing two pages returning the differences between them regarding the retrieved HTML code

bull Functions

o Constructor(additionPrefixLabel additionSuffixLabel deletionPrefixLabel deletionSuffixLabel tokenSeparator)

bull additionPrefixLabel prefix to use when generating the result page for the new content (by default green background HTML tag)

bull additionSuffixLabel suffix to use when generating the result page for the new content (by default green background HTML end tag)

bull deletionPrefixLabel prefix to use when generating the result page for the deleted content (by default red background HTML tag)

bull deletionSuffixLabel prefix to use when generating the result page for the deleted content (by default red background HTML end tag)

bull tokenSeparator indicates the character string used as HTML page element separator when the result page is generated so that each one of them can be adequately identified

o diff (baseCode finalCode) returns ldquotruerdquo if both pages are identical ldquofalserdquo if they are different

bull baseCode character string with the source page content

bull finalCode character string or page object with the target page content

o exec (baseCode finalCode) executes the Diff component returning a character string that represents the HTML content of those pages pointing out the differences between them

bull baseCode character string with the source page content

bull finalCode character string or page object with the target page content

o setAdditionPrefixLabel (additionPrefixLabel) modifies the additional data starting tag

bull additionPrefixLabel prefix to use when generating the result page for new content (by default green background HTML tag)

o setAdditionSuffixLabel(additionSuffixLabel) modifies the additional data ending tag

bull additionSuffixLabel suffix to use when generating the result page for the new content (by default green background HTML end tag)

o setDeletionPrefixLabel(deletionPrefixLabel) modifies the deleted data starting tag

bull deletionPrefixLabel prefix to use when generating the result page for the deleted content (by default red background HTML tag)

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 29

  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the end tag that marks deleted content.

    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, a red-background HTML end tag).

  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.

    • nullWhenEquals: “true” means that “null” is returned when both pages are equal; “false” means that the result page is returned.

  o setIgnoreTagAttributes(simplifyTags): makes the component ignore HTML tag attributes when comparing both pages.

    • simplifyTags: “true” means that the HTML tag attributes are ignored; with “false” they are taken into account.

  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.

    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.

  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page.

    • mergedDeletions: with “true”, the deleted content is shown. With “false”, the configuration made through setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.

  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.

    • replacement: Perl [PERL] regular expression.

  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page; tokens that match a regular expression are discarded before the comparison starts.

    • regexp: Perl [PERL] regular expression.
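The effect of the label and flag setters can be illustrated with a small token-level comparison. The following is only a sketch in plain JavaScript and not the ITPilot implementation: the function diffTokens, its options object and the default label strings are hypothetical, whitespace-separated tokens stand in for HTML tokens, and tokenSeparator is assumed to be the string placed between tokens in the result.

```javascript
// Hypothetical stand-in for the Diff component (assumption, not ITPilot code).
function diffTokens(baseCode, finalCode, options) {
  const o = Object.assign({
    additionPrefixLabel: '<span style="background:green">', // assumed default
    additionSuffixLabel: '</span>',
    deletionPrefixLabel: '<span style="background:red">',   // assumed default
    deletionSuffixLabel: '</span>',
    tokenSeparator: ' ',
    showRemovedContent: true,
    nullWhenEquals: false,
    caseInsensitive: false
  }, options);
  const normalize = s => (o.caseInsensitive ? s.toLowerCase() : s);
  const a = normalize(baseCode).split(/\s+/).filter(Boolean);
  const b = normalize(finalCode).split(/\s+/).filter(Boolean);

  // Longest-common-subsequence table, filled from the end of both token lists.
  const lcs = Array.from({ length: a.length + 1 }, () => new Array(b.length + 1).fill(0));
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j] ? lcs[i + 1][j + 1] + 1 : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both lists, wrapping added and deleted tokens in their labels.
  const out = [];
  let changed = false;
  let i = 0;
  let j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && a[i] === b[j]) {
      out.push(a[i]); i++; j++;
    } else if (j < b.length && (i === a.length || lcs[i][j + 1] >= lcs[i + 1][j])) {
      out.push(o.additionPrefixLabel + b[j] + o.additionSuffixLabel); j++; changed = true;
    } else {
      if (o.showRemovedContent) out.push(o.deletionPrefixLabel + a[i] + o.deletionSuffixLabel);
      i++; changed = true;
    }
  }
  // Mirrors setNullWhenEquals: identical pages may yield null.
  if (!changed && o.nullWhenEquals) return null;
  return out.join(o.tokenSeparator);
}

console.log(diffTokens('a b c', 'a x c'));
```

In this sketch, disabling showRemovedContent simply drops deleted tokens from the output, which is why the deletion labels have no effect in that case.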

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.

• Functions:

  o Constructor(expression)

    • expression: object that defines the condition expression. This object is expressed as a character string (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.

    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
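The way positional placeholders such as $0 and $1 bind to the list passed to exec can be sketched as follows. This is an illustration only: evalPositional is a hypothetical helper, not part of ITPilot, and plain JavaScript operators stand in for the function set of Appendix A of [GENER].

```javascript
// Hypothetical helper (assumption): binds $0, $1, ... to the items of exprInput.
function evalPositional(expression, exprInput) {
  // Rewrite each $n placeholder as an index into the argument list.
  const body = expression.replace(/\$(\d+)/g, (match, n) => 'args[' + n + ']');
  // Evaluate the rewritten text as a JavaScript expression over args.
  return new Function('args', 'return (' + body + ');')(exprInput);
}

console.log(evalPositional('($0 <= $1)', [3, 5])); // true
```

The first list element is bound to $0 and the second to $1, so the order of the values passed to exec determines the result.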

5.3.11 Extractor

• Object: Extractor

• Description: responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).

• Functions:

  o Constructor(name, page, specification, structure)

    • name: name of the Extractor component instance.

    • page: page-type ITPilot structure from which data is to be extracted.

    • specification: DEXTL data extraction specification (see [DEXTL]).

    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.

  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.

  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).

    • merge: Boolean parameter. “true” if the pattern merging technique is to be applied, “false” if not. This is “true” by default.

  o setI18n(i18n): updates the internationalization configuration of the process.

    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

  o Constructor(url, sequenceType, reusableConnection, binary, page)

    • url: URL where the resource to be downloaded can be found (OPTIONAL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.

    • page: optionally, the page from which the HTTP request is launched can be indicated.

  o exec(page): runs the component, returning the string- or binary-type value obtained.

    • page: optionally, the page from which the HTTP request is launched can be indicated.

  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.

    • encoding: MIME type of the information to send.

  o syncWithPost(flag): lets the user set the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.

    • flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.

  o setBackSequence(back): lets the user optionally set an explicit browsing sequence that returns to the source page, against which further extraction operations are going to be executed.

    • back: back sequence NSEQL program.

  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

    • reusingConnection: if set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing information from the previous session.

  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
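The order in which the synchronization options described for syncWithPost, setBackSequence and setBackPages are considered can be summarized in a short sketch. The function chooseBackStrategy, the shape of its config object and the default of one page are assumptions made for illustration; they are not ITPilot APIs.

```javascript
// Hypothetical decision helper (assumption): picks how to return to the source page.
function chooseBackStrategy(config) {
  if (config.syncWithPost) {
    // Re-send the POST request that originally produced the page.
    return { kind: 'POST', url: config.lastURL, parameters: config.lastURLPostParameters };
  }
  if (config.backSequence) {
    // An explicit back sequence (an NSEQL program) takes precedence over Back().
    return { kind: 'BACK_SEQUENCE', program: config.backSequence };
  }
  // Fall back to the Back() command; one page is an assumed default.
  return { kind: 'BACK_COMMAND', pages: config.backPages === undefined ? 1 : config.backPages };
}

console.log(chooseBackStrategy({ syncWithPost: false, backPages: 2 }).kind);
```

The same precedence is described for the Form Iterator and Next Interval Iterator components, so the sketch applies to them under the same assumptions.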

5.3.13 Filter

• Object: Filter

• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

  o Constructor(expr, auxiliaryRecords)

    • expr: regular expression of the filtering operation, applied to the list of records described in the exec function.

    • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.

  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.

    • inputRecords: list of input records.

    • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
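The behavior described in the note can be sketched in plain JavaScript. The function filterRecords and the string values given to the two constants are hypothetical; only the constant names come from this guide, and a JavaScript predicate stands in for the filter expression.

```javascript
// Constant names follow the guide; their values here are assumptions.
const ON_ERROR_RAISE = 'ON_ERROR_RAISE';
const ON_ERROR_IGNORE = 'ON_ERROR_IGNORE';

// Hypothetical stand-in for Filter.exec: keeps the records that satisfy the predicate.
function filterRecords(inputRecords, predicate, onError) {
  const kept = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record)) kept.push(record);
    } catch (err) {
      // With ON_ERROR_IGNORE the failing record is skipped; otherwise the error propagates.
      if (onError !== ON_ERROR_IGNORE) throw err;
    }
  }
  return kept;
}

const demoRecords = [{ price: 5 }, { price: null }, { price: 20 }];
const demoPredicate = r => {
  if (r.price === null) throw new Error('price is missing');
  return r.price > 10;
};
console.log(filterRecords(demoRecords, demoPredicate, ON_ERROR_IGNORE).length);
```

With ON_ERROR_IGNORE the record that raises the error is left out and the remaining records are still evaluated.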

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: allows a run loop to be generated for a specific form, where predetermined values for each of the fields included are used in each run.

• Functions:

  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.

    • inputPage: input page from which the selected form can be iteratively invoked.

    • parallelIterator: with “true”, the component executes its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.

    • field: name of the multiple selection field.

    • position: position of the field among those of the same name, starting with position 0.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field of the chosen form.

    • field: name of the multiple selection field.

    • position: position of the field among those of the same name, starting with position 0.

    • valuesArray: list of values that must be selected in the field.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).

    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): indicates the values selected from a selection field of the chosen form.

    • field: name of the HTML selection field.

    • position: position occupied by the field in the event of more than one field with the same name.

    • positions: values of the elements on which the component must iterate.

  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.

    • field: name of the HTML text field.

    • position: position of the field in the event of several fields on the form with the same name.

    • values: list of values that must be selected in the field.

    • positions: list that indicates the position held by each value element in the event of replicated values.

    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.

  o click(field, value, state): selects an element and runs a “click” event on it.

    • field: name of the HTML field on which the click is to be made.

    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.

    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o input(field, position, values): indicates the values added to an input field.

    • field: name of the HTML input field.

    • position: position of the field in the event of several fields on the form with the same name.

    • values: list of values that must be entered in the field.

  o textarea(field, position, values): indicates the values added to a text area.

    • field: name of the HTML text area field.

    • position: position of the field in the event of several fields on the form with the same name.

    • values: list of values that must be entered in the field.

  o toList(): returns the list with the NSEQL sequences used in each iteration.

  o setMaxIterations(count): sets the maximum number of iterations that can be executed.

    • count: number that determines the maximum number of iterations.

  o setRetries(count): updates the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o setParallelIterator(flag): makes the component launch its iterations in parallel.

    • flag: with “true”, the iterations are executed in parallel.

  o next(inputPage): returns the page resulting from running a component iteration.

    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.

  o hasNext(): determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.

  o close(): closes the iterator.

  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.

    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor a POST navigation has been configured.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
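The way field values multiply into iterations, including the two combinations produced by CLICKED_AND_NON_CLICKED_ELEMENT and the cap set by setMaxIterations, can be sketched as follows. The class FormIterationPlan is hypothetical and much simpler than the Form Iterator component: it omits the position parameter and the NSEQL programs, and it returns plain objects instead of pages.

```javascript
// Constant names follow the guide; their values here are assumptions.
const CLICKED_ELEMENT = 'CLICKED_ELEMENT';
const NON_CLICKED_ELEMENT = 'NON_CLICKED_ELEMENT';
const CLICKED_AND_NON_CLICKED_ELEMENT = 'CLICKED_AND_NON_CLICKED_ELEMENT';

// Hypothetical model of the iteration space of a form (assumption, not ITPilot code).
class FormIterationPlan {
  constructor() {
    this.options = [];            // one array of alternatives per configured field
    this.maxIterations = Infinity;
    this.cursor = 0;
  }
  input(field, values) {
    this.options.push(values.map(value => ({ field, value })));
    return this;
  }
  click(field, value, state) {
    // "Both" expands into a marked and an unmarked alternative.
    const marks = state === CLICKED_AND_NON_CLICKED_ELEMENT
      ? [true, false]
      : [state === CLICKED_ELEMENT];
    this.options.push(marks.map(clicked => ({ field, value, clicked })));
    return this;
  }
  setMaxIterations(count) {
    this.maxIterations = count;
    return this;
  }
  toList() {
    // Cartesian product of all alternatives, truncated to the iteration limit.
    const all = this.options.reduce(
      (combos, alternatives) =>
        combos.flatMap(combo => alternatives.map(alt => combo.concat([alt]))),
      [[]]);
    return all.slice(0, this.maxIterations);
  }
  hasNext() {
    return this.cursor < this.toList().length;
  }
  next() {
    return this.toList()[this.cursor++];
  }
}

const demoPlan = new FormIterationPlan()
  .input('query', ['a', 'b'])
  .click('option', 'x', CLICKED_AND_NON_CLICKED_ELEMENT);
console.log(demoPlan.toList().length);
```

Two input values combined with one checkbox in the "both" state give four iterations in this model; a lower setMaxIterations value truncates that list.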

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.

• Functions:

  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.

    • browserUuid: browser id.

  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters for later browsing by using a Sequence object (see section 5.3.27).

    • pageType: type of browser used to access the page:

      • SEQUENCE_IEBROWSER = 1

      • SEQUENCE_HTTP_BROWSER = 2

    • lastURL: last URL the page is coming from.

    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.

    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.

    • cookie: stored cookie information.

    • proxyUser: user name to access the proxy, if required.

    • proxyPassword: user password to access the proxy, if required.

    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: responsible for storing the structure of the input data, that is, the data that the wrapper receives from the calling application.

• Functions:

  o Constructor(input, output)

    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).

  o get(name): returns the value of a field of the record created from the initialization parameters.

    • name: name of the record field.

  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

    • field: name of the field to create.

    • format: representation format of the date field. This format is optional, but once it is specified it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].

    • obl: parameter that indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): updates the component name.

    • name: new component name.

  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.

    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

  o exec(): main function for running the component; returns a record representing the wrapper initialization parameters.
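The meaning of the OPTIONAL, OBLIGATORY and FIXED values can be illustrated with a small model of the initialization record. This is a sketch under stated assumptions: InitSketch and addField are hypothetical, the constant values are placeholders, and exec receives the query parameters as an argument here, whereas the Init component described above takes them from the JavaScript context.

```javascript
// Constant names follow the guide; their values here are assumptions.
const OPTIONAL = 'OPTIONAL';
const OBLIGATORY = 'OBLIGATORY';
const FIXED = 'FIXED';

// Hypothetical model of the initialization record (assumption, not ITPilot code).
class InitSketch {
  constructor() {
    this.fields = [];
  }
  setText(field, obl, fixedValue) {
    return this.addField(field, 'text', obl, fixedValue);
  }
  setInt(field, obl, fixedValue) {
    return this.addField(field, 'int', obl, fixedValue);
  }
  addField(field, type, obl, fixedValue) {
    // OPTIONAL is the default when no obligation is given.
    this.fields.push({ field, type, obl: obl || OPTIONAL, fixedValue });
    return this;
  }
  exec(queryParameters) {
    const record = {};
    for (const f of this.fields) {
      const supplied = Object.prototype.hasOwnProperty.call(queryParameters, f.field);
      if (f.obl === FIXED) {
        record[f.field] = f.fixedValue;            // constant, ignores the query
      } else if (supplied) {
        record[f.field] = queryParameters[f.field];
      } else if (f.obl === OBLIGATORY) {
        throw new Error('Missing obligatory parameter: ' + f.field);
      } else {
        record[f.field] = null;                    // optional and not supplied
      }
    }
    return record;
  }
}

const demoInit = new InitSketch()
  .setText('TITLE', OBLIGATORY)
  .setInt('PAGE')
  .setText('LANG', FIXED, 'en');
console.log(demoInit.exec({ TITLE: 'denodo' }));
```

In this model a FIXED field always carries its constant value, an OBLIGATORY field without a value stops the query, and an OPTIONAL field without a value is left empty.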

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records, one by one.

• Functions:

  o Constructor(list)

    • list: list of records on which to iterate.

  o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.

  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
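The hasNext/next contract can be reproduced in a few lines of plain JavaScript. ListIterator is a hypothetical stand-in for the Iterator component, written only to show the sequential traversal pattern; it does not model parallel execution or the Thread object.

```javascript
// Hypothetical stand-in for the Iterator component (assumption, not ITPilot code).
class ListIterator {
  constructor(list) {
    this.list = list.slice();   // copy, so later changes to the input do not affect iteration
    this.position = 0;
  }
  hasNext() {
    return this.position < this.list.length;
  }
  next() {
    if (!this.hasNext()) throw new Error('No more records');
    return this.list[this.position++];
  }
}

const demoIterator = new ListIterator([{ id: 1 }, { id: 2 }]);
while (demoIterator.hasNext()) {
  console.log(demoIterator.next().id);
}
```

Records are returned in list order, and calling next() after the last record raises an error in this sketch.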

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.

• Functions:

  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

    • uuid: component unique identifier.

    • uri: connection URL to the database.

    • driver: driver class to use to connect to the data source.

    • userName: user name.

    • password: user password.

    • structure: structure of the component's output record list. It is defined as a record of values.

    • baseRecords: record list to be used.

    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

    • query: SQL query that returns the results required by the component.

  o exec(query, baseRecords): executes the JDBCExtractor component.

    • query: SQL query that returns the results required by the component.

    • baseRecords: record list to be used.

  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.

  o addDriverProperty(propname, propvalue): adds a JDBC driver property.

    • propname: property name.

    • propvalue: property value.
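The roles of maxPoolSize, initialPoolSize and the check (ping) query can be illustrated with a toy pool. Everything in this sketch is an assumption made for illustration: PoolSketch, its acquire and release methods, and the fake connection objects are hypothetical, no database is contacted, and the actual pooling behavior of JDBCExtractor may differ.

```javascript
// Hypothetical connection pool model (assumption, not ITPilot code).
class PoolSketch {
  constructor(maxPoolSize, initialPoolSize, pingQuery, connect) {
    this.maxPoolSize = maxPoolSize;
    this.pingQuery = pingQuery;
    this.connect = connect;     // factory that opens a new connection
    this.idle = [];
    this.active = 0;
    // Open the initial idle connections up front.
    for (let i = 0; i < initialPoolSize; i++) this.idle.push(connect());
  }
  acquire() {
    // Prefer a cached connection, but verify it with the ping query first.
    while (this.idle.length > 0) {
      const candidate = this.idle.pop();
      if (candidate.ping(this.pingQuery)) {
        this.active++;
        return candidate;
      }
      // A connection that fails the check is dropped.
    }
    if (this.active >= this.maxPoolSize) throw new Error('Pool exhausted');
    this.active++;
    return this.connect();
  }
  release(connection) {
    this.active--;
    this.idle.push(connection);
  }
}

// Fake connections for the demonstration; ping succeeds while alive is true.
let demoOpened = 0;
const demoConnect = () => ({ id: ++demoOpened, alive: true, ping() { return this.alive; } });
const demoPool = new PoolSketch(2, 1, 'SELECT 1', demoConnect);
console.log(demoPool.acquire().id);
```

In this model the ping query is only consulted when a cached connection is about to be reused, which is why it should be cheap to run.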

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: allows iteration over different inter-related pages by means of one or several browsing sequences.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.

    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information.

    • inputPage: indicates the page from which the next browsing sequence is to be run.

  o next(inputRecords, inputPage): returns the next iteration element.

    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.

    • inputPage: indicates the page from which the next pages are to be accessed.

  o close(): closes the iterator.

  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.

    • count: number of retries.

  o setRetryDelay(count): configures the interval between two retries.

    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.

    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor a POST navigation has been configured.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: HTTP browser implementation

5.3.21 Output

• Object: Output
• Description: this places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): this adds the component input record that is to be used as the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: this allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): this runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements, except for the one that caused the error.
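The way a field expression refers to an input record can be pictured with a few lines of plain JavaScript. The following is only a sketch under stated assumptions: it assumes the reference syntax is $<index>.<FIELD>, it models records as plain objects, and resolveFieldReference is a hypothetical helper, not part of the ITPilot API. It does not cover the functions of Appendix A of [GENER].

```javascript
// Hypothetical helper: resolves "$<index>.<FIELD>" against a list of input records.
// Records are modeled as plain objects; this is not how ITPilot represents them.
function resolveFieldReference(expression, recordsObj) {
  var m = /^\$(\d+)\.(\w+)$/.exec(expression);
  if (!m) {
    throw new Error("unsupported expression: " + expression);
  }
  var record = recordsObj[parseInt(m[1], 10)];
  if (record === undefined || !(m[2] in record)) {
    // Corresponds to the case where the expression cannot be evaluated.
    throw new Error("cannot evaluate " + expression);
  }
  return record[m[2]];
}

// Two input records, in the order they would be passed as recordsObj.
var inputs = [{ PARAM1: "Madrid", PARAM2: 10 }, { TOTAL: 42 }];
var city = resolveFieldReference("$0.PARAM1", inputs);  // field of the first record
var total = resolveFieldReference("$1.TOTAL", inputs);  // field of the second record
```

In this sketch a failed lookup always throws; choosing between stopping and continuing is what the errorAction parameter controls in the real component.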

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: This creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used to date is reused, or whether a new browser is launched maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; this allows a homepage to be indicated.
  o exec(): this returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: This allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function

5.3.26 Script

• Description: This component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When this component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from its position within the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: This creates a browsing sequence in NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter; this indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): this runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; this describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): this closes the connection with the running browser.
  o syncWithPost(flag): this method indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.

  o setBackSequence(back): this function optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): this returns the NSEQL (see [NSEQL]) sequence.
  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation

5.3.28 Store File

• Object: StoreFile
• Description: this stores the contents entered as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input. In that case, the page content will be stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): this function determines whether the output file name should be automatically generated when the input file is null or is a directory.
    • generate: indicates whether the file name should be automatically generated.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: this represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): This causes the thread to wait until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): this launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests will be queued until the ongoing executions finish.
    • int: maximum number.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This is the function that defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }

  o This function is responsible for defining the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
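The functions listed above can be put together in a small skeleton. The following is a minimal sketch for a hypothetical component called “uppercase”; the stand-in Record_Structure, the field name TEXT, the assumption that the input schema is returned as a Record Structure and the way the input is read inside the main function are illustrative only and are not confirmed by this guide. The stand-ins at the top exist solely so that the sketch runs outside ITPilot.

```javascript
// Stand-ins so the sketch runs outside ITPilot. Inside a wrapper, the type
// constant and Record_Structure would be provided by ITPilot itself.
var STRING_TYPE = 13; // value taken from the list of output types
function Record_Structure(name) {
  this.name = name;
  this.fields = [];
}
Record_Structure.prototype.setText = function (field) {
  this.fields.push(field);
};

// Input schema: a record with a single text field (hypothetical name TEXT).
function uppercase_getInputStructure() {
  var rs = new Record_Structure("UPPERCASE_INPUT");
  rs.setText("TEXT");
  return rs;
}

// Output type: a string, so no uppercase_getOutputStructure function is needed.
function uppercase_getOutputType() {
  return STRING_TYPE;
}

// Main function: the input is assumed to expose its fields as properties.
function uppercase_main(uppercase_input) {
  var uppercase_output = null;
  uppercase_output = String(uppercase_input.TEXT).toUpperCase();
  return uppercase_output;
}
```

Because the output type is neither RECORD_TYPE nor LIST_TYPE, the sketch omits the output structure function, as the description above allows.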

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters the custom component receives as input.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is straightforward: the VQL statement is written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘\’ character.
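The escaping step can be automated before the statement is assembled. The following is a minimal sketch, assuming that the delimiter is the double quote and that the escape character is the backslash; escapeQuotes and buildCreateWrapper are hypothetical helper names, and the optional MAINTENANCE clause is left out.

```javascript
// Hypothetical helper: prefixes every double quote in the script with a backslash.
function escapeQuotes(jsCode) {
  return jsCode.replace(/"/g, '\\"');
}

// Hypothetical helper: assembles the statement around the escaped script.
// Assumes double quotes delimit the JavaScript code.
function buildCreateWrapper(name, jsCode) {
  return "CREATE WRAPPER ITP " + name + " \"" + escapeQuotes(jsCode) + "\"";
}

var statement = buildCreateWrapper("mywrapper", 'var msg = "hi";');
```

Generating the statement this way avoids escaping long scripts by hand, which is where unbalanced quotes tend to appear.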

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

5.2.1 Initialization of Searchable Parameters

This function is used to describe the input parameters of the ITPilot wrapper. In the example, the first line of the function, var start = new Init(), is responsible for creating a new parameter initialization object. This object is described later, in section 5.3 (the component catalog).

5.2.2 Main Function

This is where the wrapper business logic is developed. In this function, different object instances are created, each of which represents an ITPilot component, either predefined or custom (see [GENER] for more information about how to create custom components with ITPilot). The functions published by every predefined ITPilot component are described and explained in section 5.3.

5.2.3 Generating the Output Structure

This function determines the wrapper’s output structure, if one exists. The structure is a data record implemented by the RecordStructure object, which is defined in the section 5.3 catalog.

5.3 PREDEFINED ITPILOT COMPONENT GUIDE

5.3.1 Introduction

This chapter lists the predefined ITPilot components. Each component is represented as an instantiable object in JavaScript, with a series of functions that are described and explained below.

NOTE: Some of the parameters used in the described functions can be omitted (by invoking the method with fewer input arguments). A parameter cannot be omitted if the value of another input argument to its right has to be defined. When a parameter is optional, its default value is indicated in the function description. For example, for the object RECORD_STRUCTURE (see section 5.3.2.1), rs.setText(FIELD) is equivalent to rs.setText(FIELD, "", OPTIONAL). rs.setText(FIELD, OBLIGATORY) is not valid; the following must be used: rs.setText(FIELD, "", OBLIGATORY).
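The rule about omitted parameters follows from the arguments being positional. The following is a minimal sketch in plain JavaScript that mimics the signature setText(field, regexp, type); setTextSketch and the values given to OPTIONAL and OBLIGATORY are hypothetical stand-ins, not the ITPilot implementation.

```javascript
// Hypothetical stand-ins; the real OPTIONAL/OBLIGATORY values are provided by ITPilot.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";

// Mimics the positional signature setText(field, regexp, type) with trailing defaults.
function setTextSketch(field, regexp, type) {
  if (regexp === undefined) { regexp = ""; }    // default: no constraint
  if (type === undefined) { type = OPTIONAL; }  // default: optional field
  return { field: field, regexp: regexp, type: type };
}

var a = setTextSketch("TITLE");                    // both trailing arguments omitted
var b = setTextSketch("TITLE", "", OPTIONAL);      // the same call, written in full
var wrong = setTextSketch("TITLE", OBLIGATORY);    // OBLIGATORY lands in the regexp slot
var right = setTextSketch("TITLE", "", OBLIGATORY);
```

In the third call the value intended for type is taken as the regular expression, which is why an argument can only be omitted when nothing to its right needs to be set.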

5.3.2 Data Structures

ITPilot defines List and Record (a data record defined by the Record Structure object) as data structures. The following sections describe them.

5.3.2.1 Record Structure

• Object: Record_Structure
• Description: This represents a data structure that defines the structure of a specific record. It is often used in the getOutputSchema() function of the wrapper (see 5.2.3).
• Functions:
  o Constructor(name)
    • name: name of the structure.
  o setText(field, regexp, type): creates a new character string field in the record.

    • field: name of the new field.
    • regexp: (optional) regular expression of the character string generation. By default, if no constraint exists, its value is “”.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.
  o setLink(field, type): creates a new Link-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.
  o setInt(field, type): creates a new Integer-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBoolean(field, type): creates a new boolean-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.
  o setLong(field, type): creates a new Long-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.
  o setFloat(field, type): creates a new Float-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.
  o setDouble(field, type): creates a new Double-type field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.
  o setBlob(field, type): creates a new BLOB-type (Binary Large Object) field in the record.
    • field: name of the new field.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

  o setDate(field, regexp, format, type): creates a new Date-type field in the record.
    • field: name of the new field.
    • regexp: (optional) regular expression of the character string generation. By default, if no constraint exists, its value is “”.
    • format: (optional) date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.
  o setRegister(record, type): creates a new Record-type field in the record.
    • record: record name.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.
  o setArray(name, structure, type): creates a new Array-type field in the record.
    • name: name of the array.
    • structure: data structure that represents the record structure contained in the array.
    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.
  o toString(): This transforms the record into a character string for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: To assign values to the fields of a record, the RECORD_CONSTRUCTOR, as explained in section 5.3.22, must be used, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List
• Functions:
  o setListName(listName): sets the name of the list.
    • listName: name of the list.
  o add(obj): adds an element to the list.
    • obj: element to add.
  o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog points out the components that lack any of these “common” functions.

5.3.3.1 onError function

• onError(errorId, errorAction): This tells the component how to behave in the event of an error. The onError function can be invoked several times with different errorId parameter values.
  o errorId: This indicates the type of error for which the behavior is to be managed. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: This error is raised if the Web source takes too long to answer. The waiting time is configurable. Where the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value will return “null” when there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, it is evaluated as “false” if they are used in a condition expression.
    • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configured for each component.
    • ON_ERROR_RETRY_IGNORE: rerun the wrapper, as with ON_ERROR_RETRY, but continue with the wrapper execution if the error still occurs after the retries.
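The four error actions can be pictured with a small dispatcher. The following is a minimal sketch in plain JavaScript, not ITPilot's implementation: the constant values, the function runWithErrorAction and its retries argument are hypothetical stand-ins, the retry delay is omitted, and only the failing step is retried rather than the whole wrapper. Raising the error once ON_ERROR_RETRY has used up its retries is inferred from the description of ON_ERROR_RETRY_IGNORE.

```javascript
// Hypothetical stand-ins for the ITPilot error-action constants.
var ON_ERROR_RAISE = "RAISE";
var ON_ERROR_IGNORE = "IGNORE";
var ON_ERROR_RETRY = "RETRY";
var ON_ERROR_RETRY_IGNORE = "RETRY_IGNORE";

// Runs `step` and applies the chosen action when it throws.
// `retries` plays the role of the number of retries configured on a component.
function runWithErrorAction(step, action, retries) {
  var retrying = action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE;
  var attempts = retrying ? retries + 1 : 1;
  var lastError = null;
  for (var i = 0; i < attempts; i++) {
    try {
      return step();
    } catch (e) {
      lastError = e; // remember the failure and try again if attempts remain
    }
  }
  if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
    return null; // continue the run with a null result
  }
  throw lastError; // ON_ERROR_RAISE, or ON_ERROR_RETRY with retries used up
}

// A step that fails twice and then succeeds.
var calls = 0;
function flaky() {
  calls++;
  if (calls < 3) { throw new Error("timeout"); }
  return "page";
}
var result = runWithErrorAction(flaky, ON_ERROR_RETRY, 2);
var ignored = runWithErrorAction(function () { throw new Error("boom"); }, ON_ERROR_IGNORE, 0);
```

Here result holds the value of the third attempt, and ignored is null because the failure was swallowed.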

5.3.3.2 debugLevel function

• debugLevel(level): This sets the trace level to be used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: this parameter defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. This evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
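The relation between the positional references in expr and the elements passed to exec can be pictured with a toy evaluator. The following is only a sketch: ConditionSketch is a hypothetical stand-in that understands a single comparison between two references, whereas the real component accepts the expression language described in Appendix A of [GENER].

```javascript
// Hypothetical stand-in for Condition; it only understands "$i <op> $j".
function ConditionSketch(expr) {
  var m = /^\s*\(?\s*\$(\d+)\s*(<=|>=|==|<|>)\s*\$(\d+)\s*\)?\s*$/.exec(expr);
  if (!m) {
    throw new Error("unsupported expression: " + expr);
  }
  this.left = parseInt(m[1], 10);   // index of the left-hand element
  this.op = m[2];                   // comparison operator
  this.right = parseInt(m[3], 10);  // index of the right-hand element
}

// Applies the comparison to the elements selected by the two indexes.
ConditionSketch.prototype.exec = function (elements) {
  var a = elements[this.left];
  var b = elements[this.right];
  switch (this.op) {
    case "<=": return a <= b;
    case ">=": return a >= b;
    case "<": return a < b;
    case ">": return a > b;
    default: return a == b;
  }
};

var myCondition = new ConditionSketch("($0 <= $1)");
var first = myCondition.exec([3, 7]);   // 3 <= 7
var second = myCondition.exec([9, 7]);  // 9 <= 7
```

The same condition object is evaluated twice with different element lists, which mirrors how the expression is fixed in the constructor while the elements are supplied on each exec call.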

5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.

5.3.8 Diff

• Object: Diff
• Description: the Diff component compares two pages, returning the differences between them in terms of the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: indicates the character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.

    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, a red background HTML end tag).

  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return "null" instead of the page itself.

    • nullWhenEquals: "true" implies that "null" will be returned when both pages are equal; "false" means that the result page will be returned.

  o setIgnoreTagAttributes(simplifyTags): the component will not take the HTML tag attributes into account when comparing both pages.

    • simplifyTags: "true" means that the HTML tag attributes will be ignored. With "false" they will not be ignored.

  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.

    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.

  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.

    • mergedDeletions: with "true", the deleted content will be shown. If the value is "false", the configuration set with the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.

  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.

    • replacement: Perl [PERL] regular expression.

  o addIgnoredToken(regexp): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the page. The tokens that match a regular expression will be discarded before the comparison starts.

    • regexp: Perl [PERL] regular expression.
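
The marking behaviour described for the Diff component can be illustrated with a small stand-alone function. This is only a sketch and not the ITPilot implementation: the function name markDifferences, its option names, the plain-text default labels and the longest-common-subsequence comparison are all assumptions made for the example. Only the ideas of wrapping added and deleted tokens in configurable prefix and suffix labels, of hiding removed content, and of discarding tokens that match an ignore pattern come from the description above.

```javascript
// Hypothetical stand-in for the marking step of Diff (not the ITPilot API).
function markDifferences(baseTokens, finalTokens, options = {}) {
  const {
    additionPrefix = "[+", // assumed plain-text labels, chosen for readability
    additionSuffix = "+]",
    deletionPrefix = "[-",
    deletionSuffix = "-]",
    ignored = [],          // regular expressions; matching tokens are dropped
    showRemoved = true,
  } = options;

  const keep = (token) => !ignored.some((re) => re.test(token));
  const a = baseTokens.filter(keep);
  const b = finalTokens.filter(keep);

  // Longest-common-subsequence table: lcs[i][j] covers a[i..] and b[j..].
  const lcs = Array.from({ length: a.length + 1 }, () =>
    new Array(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both token lists: common tokens pass through, the rest get labels.
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && a[i] === b[j]) {
      out.push(a[i]);
      i++;
      j++;
    } else if (j < b.length && (i === a.length || lcs[i][j + 1] >= lcs[i + 1][j])) {
      out.push(additionPrefix + b[j] + additionSuffix);
      j++;
    } else {
      if (showRemoved) out.push(deletionPrefix + a[i] + deletionSuffix);
      i++;
    }
  }
  return out;
}

const marked = markDifferences(["the", "old", "text"], ["the", "new", "text"]);
console.log(marked.join(" ")); // the [+new+] [-old-] text
```

Passing showRemoved: false in the options mirrors the effect described for setShowRemovedContent: deleted tokens are omitted and the deletion labels are never used.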

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that will be evaluated to produce an output value.

• Functions:

  o Constructor(expression)

    • expression: object that defines the expression. This object is expressed as a string of characters (e.g., for a condition, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.

    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
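
The positional placeholders ($0, $1, ...) used in the expression string can be illustrated with a small stand-in. This is a sketch under stated assumptions, not the ITPilot evaluator: makeExpression is a hypothetical helper, and it only understands expressions that also happen to be valid JavaScript, whereas ITPilot uses its own function set (Appendix A of [GENER]).

```javascript
// Hypothetical helper (not the ITPilot API): binds $0, $1, ... to the
// positions of the list passed to exec().
function makeExpression(expression) {
  // Rewrite "$0", "$1", ... into lookups on the positional input array.
  const body = expression.replace(/\$(\d+)/g, (_, index) => `input[${index}]`);
  const evaluate = new Function("input", `return (${body});`);
  return {
    exec: (exprInput = []) => evaluate(exprInput),
  };
}

const lessOrEqual = makeExpression("($0 <= $1)");
console.log(lessOrEqual.exec([3, 5])); // true
console.log(lessOrEqual.exec([7, 5])); // false
```

The expression is parsed once, when the object is built, and evaluated on every exec call, which matches the split between the constructor and exec described above.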

5.3.11 Extractor

• Object: Extractor

• Description: this component is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).

• Functions:

  o Constructor(name, page, specification, structure)

    • name: name of the Extractor component instance.

    • page: page-type ITPilot structure from which data is to be extracted.

    • specification: DEXTL data extraction specification (see [DEXTL]).

    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.

  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.

  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).

    • merge: Boolean parameter; "true" if the pattern merging technique is to be applied, "false" if not. It is "true" by default.

  o setI18n(i18n): updates the internationalization configuration of the process.

    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

  o Constructor(url, sequenceType, reusableConnection, binary, page)

    • url: URL where the resource to be downloaded can be found (OPTIONAL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.

    • page: optionally, the page from which the HTTP request is launched can be indicated.

  o exec(page): runs the component, returning the string- or binary-type value obtained.

    • page: optionally, the page from which the HTTP request is launched can be indicated.

  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.

    • encoding: MIME type of the information to send.

  o syncWithPost(flag): lets the user set the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.

    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.

  o setBackSequence(back): lets the user optionally set an explicit navigation sequence back to the source page, against which further data extraction operations are going to be executed.

    • back: NSEQL back sequence program.

  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

    • reusingConnection: if the value is set to "true", the connection coming from previous components is reused; if set to "false", a new browser will be launched, importing the information from the previous session.

  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor a POST navigation has been defined.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
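
The order in which syncWithPost, setBackSequence and setBackPages are consulted when returning to the source page can be summarized in a short decision function. This is a sketch of the rules described above, not ITPilot code: chooseReturnStrategy, the option names and the returned objects are hypothetical, and the default of one page for Back() is an assumption.

```javascript
// Hypothetical helper (not the ITPilot API): picks how to return to the
// source page, following the precedence described in the text.
function chooseReturnStrategy({
  syncWithPost = true, // documented as the default synchronization method
  backSequence = null, // NSEQL program, if one was set
  backPages = 1,       // assumed default for this sketch
} = {}) {
  if (syncWithPost) {
    // Re-send the POST request that originally produced the page.
    return { kind: "post" };
  }
  if (backSequence) {
    // An explicit back sequence is used before falling back to Back().
    return { kind: "backSequence", sequence: backSequence };
  }
  // Last resort: NSEQL Back(), going back the configured number of pages.
  return { kind: "back", pages: backPages };
}

console.log(chooseReturnStrategy());
console.log(chooseReturnStrategy({ syncWithPost: false, backPages: 2 }));
```

The same precedence is described for the Form Iterator and Next Interval Iterator components, which expose the same three functions.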

5.3.13 Filter

• Object: Filter

• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

  o Constructor(expr, auxiliaryRecords)

    • expr: selection expression of the filtering operation, applied to the list of records described in the exec function.

    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.

    • inputRecords: list of input records.

    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
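
The effect of ON_ERROR_IGNORE described in the note can be sketched as follows. This is an illustration only: filterRecords is a hypothetical function, the selection expression is modelled as a plain JavaScript predicate, and the constant values are placeholders rather than the ones ITPilot defines.

```javascript
// Hypothetical stand-in for the Filter component (not the ITPilot API).
const ON_ERROR_RAISE = "raise";   // illustrative values only
const ON_ERROR_IGNORE = "ignore";

function filterRecords(inputRecords, predicate, auxiliaryRecords = [], onError = ON_ERROR_RAISE) {
  const selected = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record, auxiliaryRecords)) selected.push(record);
    } catch (error) {
      if (onError !== ON_ERROR_IGNORE) throw error;
      // ON_ERROR_IGNORE: skip only the record that failed and keep going.
    }
  }
  return selected;
}

const records = [{ price: 10 }, { price: null }, { price: 30 }];
const limits = [{ max: 20 }]; // auxiliary records: used by the condition, never filtered
const cheap = filterRecords(
  records,
  (record, aux) => {
    if (record.price === null) throw new Error("missing price");
    return record.price <= aux[0].max;
  },
  limits,
  ON_ERROR_IGNORE
);
console.log(cheap); // [ { price: 10 } ]
```

With ON_ERROR_RAISE the same call would stop at the second record and propagate the error instead of returning a partial list.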

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: allows an execution loop to be generated for a specific form, where predetermined values for each of its fields are used in each iteration.

• Functions:

  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

    • findForm: NSEQL program that locates the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences used in this component.

    • inputPage: input page from which the selected form can be iteratively invoked.

    • parallelIterator: with "true", the component will execute its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting with position 0.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting with position 0.

    • valuesArray: list of values that must be selected in the field.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one that appears in the selection field (equals = true) or only contained in it (equals = false).

    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.

    • field: name of the HTML selection field.

    • position: position of the field in the event of more than one field with the same name.

    • positions: values of the elements on which the component must iterate.

  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.

    • field: name of the HTML text field.

    • position: position of the field in the event of several fields on the form with the same name.

    • values: list of values that must be selected in the field.

    • positions: list that indicates the position held by each value element in the event of replicated values.

    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.

  o click(field, value, state): selects an element and runs a "click" event on it.

    • field: name of the HTML field on which the click is to be made.

    • value: when this function is run on Radio Buttons, this parameter indicates the elements selected, as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.

    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o input(field, position, values): indicates the values added to an input field.

    • field: name of the HTML input field.

    • position: position of the field in the event of several fields on the form with the same name.

    • values: list of values that must be selected in the field.

  o textarea(field, position, values): indicates the values added to a text area.

    • field: name of the HTML input field.

    • position: position of the field in the event of several fields on the form with the same name.

    • values: list of values that must be selected in the field.

  o toList(): returns the list with the NSEQL sequences used in each iteration.

  o setMaxIterations(count): sets the maximum number of iterations that can be executed.

    • count: number that determines the maximum number of iterations.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o setParallelIterator(flag): makes the component launch its iterations in parallel.

    • flag: with "true", the iterations will be executed in parallel.

  o next(inputPage): returns the page resulting from running a component iteration.

    • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.

  o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.

  o close(): closes the iterator.

  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.

    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if "true", the connection from previous components will be reused. With the parameter set to "false", a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command is run, that is, when neither an explicit back sequence nor a POST navigation has been configured.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
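
The way a Form Iterator walks through field values, including the two combinations produced by CLICKED_AND_NON_CLICKED_ELEMENT and the cap set by setMaxIterations, can be pictured with the following stand-in. It is a sketch under assumptions: CombinationIterator and expandState are hypothetical, the constant values are placeholders, and the order in which ITPilot actually enumerates combinations is not stated in this guide.

```javascript
// Hypothetical stand-in for the combination logic of Form Iterator
// (not the ITPilot API; the enumeration order is an assumption).
const CLICKED_ELEMENT = "clicked";           // illustrative constant values
const NON_CLICKED_ELEMENT = "non-clicked";
const CLICKED_AND_NON_CLICKED_ELEMENT = "both";

// Expand one checkbox state into the alternatives it stands for.
function expandState(state) {
  return state === CLICKED_AND_NON_CLICKED_ELEMENT
    ? [true, false]
    : [state === CLICKED_ELEMENT];
}

class CombinationIterator {
  // fields: array of { name, alternatives: [...] }
  constructor(fields, maxIterations = Infinity) {
    this.fields = fields;
    const total = fields.reduce((n, f) => n * f.alternatives.length, 1);
    this.limit = Math.min(total, maxIterations);
    this.index = 0;
  }
  hasNext() {
    return this.index < this.limit;
  }
  next() {
    if (!this.hasNext()) throw new Error("no more combinations");
    // Decode the running index as a mixed-radix number, one digit per field.
    let rest = this.index++;
    const combination = {};
    for (const field of this.fields) {
      combination[field.name] = field.alternatives[rest % field.alternatives.length];
      rest = Math.floor(rest / field.alternatives.length);
    }
    return combination;
  }
}

const iterator = new CombinationIterator([
  { name: "city", alternatives: ["Madrid", "Paris"] },
  { name: "onlyAvailable", alternatives: expandState(CLICKED_AND_NON_CLICKED_ELEMENT) },
]);
const combinations = [];
while (iterator.hasNext()) combinations.push(iterator.next());
console.log(combinations.length); // 4
```

Two values for one field and a checkbox in the "both" state give 2 x 2 = 4 form submissions; every additional field multiplies that number, which is the reason a maximum number of iterations is worth setting.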

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.

• Functions:

  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.

    • browserUuid: browser id.

  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. It is possible to execute the function with no parameters, for later browsing by means of a Sequence object (see section 5.3.27).

    • pageType: type of browser used to access the page:

      • SEQUENCE_IEBROWSER = 1

      • SEQUENCE_HTTP_BROWSER = 2

    • lastURL: last URL the page is coming from.

    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.

    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.

    • cookie: information stored as "cookies".

    • proxyUser: user name to access the proxy, if required.

    • proxyPassword: user password to access the proxy, if required.

    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: is responsible for storing the structure of the input data, which is the data that the wrapper will receive from the calling application.

• Functions:

  o Constructor(input, output)

    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record will be generated at runtime (by the exec() function).

  o get(name): returns the value of a field of the record created as the group of initialization parameters.

    • name: name of the record field.

  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

    • field: name of the field to create.

    • format: representation format of the date field. The format is optional but, once specified, it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].

    • obl: indicates whether a value for the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): updates the component name.

    • name: new component name.

  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.

    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

  o exec(): main function that runs the component, returning a record that represents the wrapper initialization parameters.
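
The meaning of the OPTIONAL, OBLIGATORY and FIXED values can be illustrated with a small stand-in for the initialization record. This is a sketch, not the ITPilot Init object: InitSketch is hypothetical, only setText is modelled, and three behaviours are assumptions made for the example (exec receives the query values as an argument, an optional field without a value is stored as null, and a FIXED field ignores any value supplied by the caller).

```javascript
// Hypothetical stand-in for the Init parameter rules (not the ITPilot API).
const OPTIONAL = "OPTIONAL";     // illustrative constant values
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

class InitSketch {
  constructor() {
    this.fields = [];
  }
  setText(field, obl = OPTIONAL, fixedValue) {
    this.fields.push({ field, obl, fixedValue });
    return this; // chaining is a convenience of this sketch
  }
  // Build the initialization record from the values supplied by the caller.
  exec(queryValues = {}) {
    const record = {};
    for (const { field, obl, fixedValue } of this.fields) {
      if (obl === FIXED) {
        record[field] = fixedValue; // constant: caller input is ignored
      } else if (field in queryValues) {
        record[field] = queryValues[field];
      } else if (obl === OBLIGATORY) {
        throw new Error(`Missing obligatory parameter: ${field}`);
      } else {
        record[field] = null; // optional and not supplied
      }
    }
    return record;
  }
}

const init = new InitSketch()
  .setText("query", OBLIGATORY)
  .setText("language")
  .setText("site", FIXED, "example.com");
console.log(init.exec({ query: "denodo" }));
```

Calling init.exec({}) in this sketch throws, because the obligatory field "query" has no value.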

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records, one by one.

• Functions:

  o Constructor(list)

    • list: list of records on which to iterate.

  o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.

  o next(): returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option available in the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29.

var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}

Figure 5: Using threads in the Iterator component
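
For comparison with the parallel structure of Figure 5, the plain sequential use of hasNext() and next() looks as follows. ListIterator is a hypothetical stand-in written for this example so that the loop can run outside ITPilot; it is not the Iterator object itself.

```javascript
// Hypothetical stand-in for the Iterator component (not the ITPilot API).
class ListIterator {
  constructor(list) {
    this.list = list;
    this.position = 0;
  }
  hasNext() {
    return this.position < this.list.length;
  }
  next() {
    if (!this.hasNext()) throw new Error("no more records");
    return this.list[this.position++];
  }
}

const iterator = new ListIterator([{ title: "A" }, { title: "B" }]);
const titles = [];
while (iterator.hasNext()) {
  const recordInstance = iterator.next(); // records come back in list order
  titles.push(recordInstance.title);
}
console.log(titles); // [ 'A', 'B' ]
```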

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.

• Functions:

  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

    • uuid: unique identifier of the component.

    • uri: connection URL to the database.

    • driver: driver class to use to connect to the data source.

    • userName: user name.

    • password: user password.

    • structure: structure of the component's output record list. It is defined as a record of values.

    • baseRecords: record list to be used.

    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.

    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.

    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

    • query: SQL query that returns the results required by the component.

  o exec(query, baseRecords): executes the JDBCExtractor component.

    • query: SQL query that returns the results required by the component.

    • baseRecords: record list to be used.

  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.

    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.

    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.

  o addDriverProperty(propname, propvalue): adds a JDBC driver property.

    • propname: property name.

    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop condition. The Condition object is described in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}

Figure 6: Using the Loop function
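
The structure of Figure 6 can be run outside ITPilot with a stand-in for the Condition object. ConditionSketch is hypothetical and takes a JavaScript function instead of an ITPilot condition expression; the sketch only shows that the condition is evaluated again before every pass and that the loop ends as soon as it stops being met.

```javascript
// Hypothetical stand-in for the Condition object used by the Loop component
// (not the ITPilot API).
class ConditionSketch {
  constructor(predicate) {
    this.predicate = predicate;
  }
  exec(input = []) {
    return Boolean(this.predicate(...input));
  }
}

let page = 1;
const visited = [];
// The condition is re-evaluated before every pass, as in a WHILE ... DO loop.
const loop = new ConditionSketch(() => page <= 3);
while (loop.exec([])) {
  visited.push(page); // stands for <loop operations>
  page += 1;
}
console.log(visited); // [ 1, 2, 3 ]
```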

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: allows iterating over a set of related pages by means of one or several browsing sequences.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

    • sequences: list of browsing sequences to use. If there is only one sequence, the component will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.

    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session's information.

    • inputPage: indicates the page from which the next browsing sequence is to be run.

  o next(inputRecords, inputPage): returns the next iteration element.

    • inputRecords: list of input records that can be used as parameters within the browsing sequences of the next interval.

    • inputPage: indicates the page from which the next pages are to be accessed.

  o close(): closes the iterator.

  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.

    • count: number of retries.

  o setRetryDelay(count): configures the interval between two retries.

    • count: interval in milliseconds.

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 50

o syncWithPost(flag) this function indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function

bull flag ldquotruerdquo indicates that this synchronization function is to be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() method is run

o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out

bull back NSEQL back program

o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not

bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session

o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 HTTP browser implementation

5.3.21 Output

• Object: Output
• Description: this places a record in the wrapper output.
• Functions:
    o Constructor(structure)
        • structure: parameter that indicates the component input record to be used as the wrapper result.
    o add(record): this adds the given component input record to the wrapper result.
        • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: this allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
    o Constructor(recordsObj, name)
        • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
        • name: name of the output record of the Record Constructor component.
    o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.
        • fieldName: name of the field.
        • expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
        • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
            • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
            • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
    o exec(): this runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements, except for the one that caused the error.
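The calls described for this component can be sketched as follows. This is a hedged illustration, not the ITPilot implementation: Record_Constructor is redefined here as a hypothetical stub, records are plain JavaScript objects, and the stub only understands expressions of the assumed form $<index>.<FIELD>. The way the stub skips a field under ON_ERROR_IGNORE is also an assumption made for the example.

```javascript
// Hypothetical stand-ins for ITPilot constants.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Stub with the same method names as the component described above.
function Record_Constructor(recordsObj, name) {
    this.records = recordsObj;
    this.name = name;
    this.fields = [];
}
Record_Constructor.prototype.add = function (fieldName, expression, errorAction) {
    this.fields.push({ name: fieldName, expression: expression, errorAction: errorAction });
};
// Stub evaluation: resolves "$<index>.<FIELD>" against the input records.
Record_Constructor.prototype.exec = function () {
    var result = {};
    for (var i = 0; i < this.fields.length; i++) {
        var field = this.fields[i];
        var parts = /^\$(\d+)\.(\w+)$/.exec(field.expression);
        var source = parts ? this.records[Number(parts[1])] : undefined;
        if (!source || !(parts[2] in source)) {
            if (field.errorAction === ON_ERROR_RAISE) {
                throw new Error("Cannot evaluate " + field.expression);
            }
            continue; // ON_ERROR_IGNORE in this stub: skip the field and carry on
        }
        result[field.name] = source[parts[2]];
    }
    return result;
};

var searchRecord = { TITLE: "ITPilot", PRICE: 10 };
var detailRecord = { AUTHOR: "Denodo" };

var builder = new Record_Constructor([searchRecord, detailRecord], "BOOK");
builder.add("TITLE", "$0.TITLE", ON_ERROR_RAISE);
builder.add("AUTHOR", "$1.AUTHOR", ON_ERROR_RAISE);
builder.add("ISBN", "$1.ISBN", ON_ERROR_IGNORE); // detailRecord has no ISBN field
var book = builder.exec();
```

Choosing the error action per field lets mandatory fields stop the run while optional ones are tolerated.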

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: this creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
    o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
        • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
        • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.
        • inputPage: optional; this allows a starting page to be indicated.
    o exec(): this returns a page object that represents the target page of the browsing sequences.
    o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.
• Functions:
    o Constructor(page)
        • page: page loaded on the browser that is going to be released.
    o Constructor(browserUuid)
        • browserUuid: browser identifier.
    o exec(): executes the component.

5.3.25 Repeat

• Description: this component allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When this component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it holds within the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: this creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
    o Constructor(sequence, sequenceType, reusableConnection, inputPage)
        • sequence: NSEQL browsing program (see [NSEQL]).
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
        • inputPage: optional parameter; this indicates the starting page. If it is not given, the NSEQL program is run directly.
    o exec(inputValues, inputPage): this runs the Sequence component, returning the last page that the browsing sequence has reached.
        • inputValues: list of values that can be used as input parameters within the browsing sequence.
        • inputPage: optional parameter; this describes the page from which the component browsing sequence is run.
    o setRetries(count): sets the number of retries in the event of failures.
        • count: number of retries.
    o setRetryDelay(mseconds): sets the waiting time between retries.
        • mseconds: waiting time between retries, in milliseconds.
    o close(): this closes the connection with the running browser.
    o syncWithPost(flag): this method indicates whether, to restore the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
        • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
    o setBackSequence(back): this function optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
        • back: NSEQL back program.
    o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.
        • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data of the previous session is imported.
    o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run, because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
        • pages: number of back pages.
    o toString(): this returns the NSEQL (see [NSEQL]) sequence.
    o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
        • 0: default browser implementation
        • 1: Internet Explorer browser implementation
        • 2: Firefox browser implementation
        • 3: Denodo HTTP browser implementation
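How setRetries and setRetryDelay interact with exec can be illustrated with a small simulation. Everything below is a sketch built on assumptions: Sequence is a hypothetical stub (the real object is provided by ITPilot), fakeNavigate is an invented function that stands in for the browser, and the string "<NSEQL program>" is a placeholder rather than a real NSEQL program. The stub does not actually wait between retries.

```javascript
var SEQUENCE_IEBROWSER = "SEQUENCE_IEBROWSER"; // stand-in for the ITPilot constant
var failuresLeft = 2; // the fake browser fails twice before succeeding

// Hypothetical browser: returns a "page" object or throws.
function fakeNavigate(nseql, inputValues) {
    if (failuresLeft > 0) {
        failuresLeft = failuresLeft - 1;
        throw new Error("connection refused");
    }
    return { program: nseql, values: inputValues };
}

// Stub with the same method names as the Sequence component.
function Sequence(sequence, sequenceType, reusableConnection, inputPage) {
    this.sequence = sequence;
    this.sequenceType = sequenceType;
    this.retries = 0;
    this.retryDelay = 0;
    this.attempts = 0;
}
Sequence.prototype.setRetries = function (count) { this.retries = count; };
Sequence.prototype.setRetryDelay = function (mseconds) { this.retryDelay = mseconds; };
Sequence.prototype.toString = function () { return this.sequence; };
Sequence.prototype.exec = function (inputValues, inputPage) {
    for (;;) {
        this.attempts = this.attempts + 1;
        try {
            return fakeNavigate(this.sequence, inputValues);
        } catch (e) {
            // Give up once the first attempt plus all retries have failed.
            if (this.attempts > this.retries) { throw e; }
            // A real implementation would wait this.retryDelay milliseconds here.
        }
    }
};

var search = new Sequence("<NSEQL program>", SEQUENCE_IEBROWSER, true);
search.setRetries(3);
search.setRetryDelay(3000);
var resultPage = search.exec(["denodo"]);
```

In this simulation the third attempt succeeds, so one of the three configured retries is left unused.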

5.3.28 Store File

• Object: StoreFile
• Description: this stores the contents received as the input parameter in a file.
• Functions:
    o Constructor(content, file)
        • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
        • file: path and name of the file where the contents are to be stored.
    o exec(): runs the component.
    o setGenerateFilename(generate): this function determines whether the output file name should be generated automatically when the input file is null or is a directory.
        • generate: indicates whether the file name should be generated automatically.
    o setRetries(count): sets the number of retries in the event of failures.
        • count: number of retries.
    o setRetryDelay(mseconds): sets the waiting time between retries.
        • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: this represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
    o wait(): this blocks until all executions launched with the execute function have finished.
    o execute(functionName, <list of arguments>): this launches a thread that runs the indicated function.
        • functionName: name of the JavaScript function to be run.
        • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.
    o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
        • int: maximum number.
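The usual execute/wait pattern, one execution per extracted record followed by a single wait, can be sketched as below. The sketch rests on several assumptions: Thread is a hypothetical stub that runs the queued calls one after another when wait() is invoked (no real concurrency takes place, and the configured maximum is stored but not used), the function name is resolved through an invented wrapperFunctions table, and the records are plain objects.

```javascript
var processed = [];

// Hypothetical table used by the stub to resolve function names.
var wrapperFunctions = {
    processRecord: function (record, label) {
        processed.push(label + ":" + record.TITLE);
    }
};

// Stub with the same method names as the Thread component.
function Thread() {
    this.maxConcurrent = 1;
    this.pending = [];
}
Thread.prototype.setMaxConcurrentThreads = function (max) { this.maxConcurrent = max; };
Thread.prototype.execute = function (functionName) {
    // Everything after the function name is passed on as its arguments.
    var args = Array.prototype.slice.call(arguments, 1);
    this.pending.push({ fn: wrapperFunctions[functionName], args: args });
};
// The stub runs the queued calls sequentially when wait() is invoked.
Thread.prototype.wait = function () {
    while (this.pending.length > 0) {
        var call = this.pending.shift();
        call.fn.apply(null, call.args);
    }
};

var records = [{ TITLE: "A" }, { TITLE: "B" }, { TITLE: "C" }];

var worker = new Thread();
worker.setMaxConcurrentThreads(2);
for (var i = 0; i < records.length; i++) {
    worker.execute("processRecord", records[i], "detail");
}
worker.wait(); // returns once every queued execution has finished
```

Calling wait() once after the loop, rather than after each execute, is what allows the executions to overlap in a real concurrent implementation.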

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
    o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
    o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
    o This function defines the component output type. The possible values are:
        • LIST_TYPE = 1
        • PAGE_TYPE = 2
        • RECORD_TYPE = 3
        • SIMPLE_TYPE = 4
        • ARRAY_TYPE = 5
        • BINARY_TYPE = 6
        • BOOLEAN_TYPE = 7
        • DATE_TYPE = 8
        • DOUBLE_TYPE = 9
        • FLOAT_TYPE = 10
        • INT_TYPE = 11
        • LONG_TYPE = 12
        • STRING_TYPE = 13
        • URL_TYPE = 14
        • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
    o This function defines the output structure returned by the component. It is needed only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
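The four functions can be laid out as in the following sketch of a component file. The component name pricefilter, its fields and its behavior are invented for illustration. The structures are returned as plain objects only as placeholders; in a real component they would be built with the Record Structure functions of section 5.3.2, and the constants would be provided by ITPilot rather than declared in the file.

```javascript
// Output type codes as listed in this section (declared here only so the sketch runs standalone).
var LIST_TYPE = 1;
var RECORD_TYPE = 3;

// Main function of the hypothetical component "pricefilter":
// keeps the input records whose PRICE does not exceed MAX_PRICE.
function pricefilter_main(pricefilter_input) {
    var pricefilter_output = [];
    var records = pricefilter_input.RECORDS;
    for (var i = 0; i < records.length; i++) {
        if (records[i].PRICE <= pricefilter_input.MAX_PRICE) {
            pricefilter_output.push(records[i]);
        }
    }
    return pricefilter_output;
}

// Input schema (placeholder object, not the real Record Structure).
function pricefilter_getInputStructure() {
    return { RECORDS: "array of { TITLE, PRICE }", MAX_PRICE: "double" };
}

// The component returns a list, so an output structure is required as well.
function pricefilter_getOutputType() {
    return LIST_TYPE;
}

// Output schema of each element of the list (placeholder object).
function pricefilter_getOutputStructure() {
    return { TITLE: "string", PRICE: "double" };
}

var cheap = pricefilter_main({
    RECORDS: [{ TITLE: "A", PRICE: 5 }, { TITLE: "B", PRICE: 20 }],
    MAX_PRICE: 10
});
```

Every function name carries the component name as prefix, which matches the naming convention described above.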

5.4.2 Using Custom Components

To use a custom component developed in JavaScript, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
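The try/finally arrangement guarantees that the scope is closed whether or not the component fails. The following sketch demonstrates this with stand-ins: SCOPE and CUSTOM_COMPONENT are hypothetical stubs that only mimic the calls shown in Figure 8, and the registry table, the component type "upper" and the helper runComponent are invented for the example.

```javascript
var scopeLog = [];

// Stub scope that records when it is created and closed.
var SCOPE = {
    create: function () { scopeLog.push("create"); },
    close: function () { scopeLog.push("close"); }
};

// Hypothetical table of available custom component main functions.
var registry = {
    upper: function (text) { return String(text).toUpperCase(); }
};

// Stub with the same method names as in Figure 8.
function CUSTOM_COMPONENT(type) {
    this.type = type;
    this.name = null;
}
CUSTOM_COMPONENT.prototype.setComponentName = function (name) { this.name = name; };
CUSTOM_COMPONENT.prototype.exec = function (input) {
    var main = registry[this.type];
    if (!main) { throw new Error("Unknown custom component: " + this.type); }
    return main(input);
};

// Invented helper wrapping the pattern of Figure 8.
function runComponent(type, name, input) {
    try {
        SCOPE.create();
        var component = new CUSTOM_COMPONENT(type);
        component.setComponentName(name);
        return component.exec(input);
    } finally {
        SCOPE.close(); // runs on success and on failure
    }
}

var ok = runComponent("upper", "Upper_1", "itpilot");

var failed = false;
try {
    runComponent("missing", "Missing_1", "x");
} catch (e) {
    failed = true;
}
```

After both calls, the log contains a matching close for every create, including the call that raised an error.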

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘\’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl


        • field: name of the new field.
        • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is “”.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setLink(field, type): creation of a new Link-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setInt(field, type): creation of a new Integer-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setBoolean(field, type): creation of a new Boolean-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setLong(field, type): creation of a new Long-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setFloat(field, type): creation of a new Float-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setDouble(field, type): creation of a new Double-type field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setBlob(field, type): creation of a new BLOB-type (Binary Large Object) field in the record.
        • field: name of the new field.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setDate(field, regexp, format, type): creation of a new Date-type field in the record.
        • field: name of the new field.
        • regexp (optional): regular expression of the character string generation. By default, if no constraint exists, its value is “”.
        • format (optional): date format, following [DATEFORMAT]. By default, its value is d-MMM-yyyy H'h' m'm' s's'.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setRegister(record, type): creation of a new Record-type field in the record.
        • record: record name.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o setArray(name, structure, type): creation of a new Array-type field in the record.
        • name: name of the array.
        • structure: data structure that represents the structure of the records contained in the array.
        • type (optional): defines whether the parameter is mandatory or not. By default, the field is optional.
    o toString(): this transforms the record into a character string that represents it.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values of the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR must be used, as explained in section 5.3.22, except in the case of Text-, Integer-, Float- and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List
• Functions:
    o setListName(listName): sets the name of the list.
        • listName: name of the list.
    o add(obj): addition of an element to the list.
        • obj: element to add.
    o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates the components that do not provide some of these “common” functions.

5.3.3.1 onError function

• onError(errorId, errorAction): this tells the component how to behave in the event of a given type of error. The onError function can be invoked several times with different errorId parameter values.
    o errorId: this indicates the type of error for which the behavior is to be managed. The possible values are:
        • RUNTIME_ERROR: error while the component is being run.
        • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
        • HTTP_ERROR: error produced by an HTTP error.
        • TIMEOUT_ERROR: error raised if the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
        • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
    o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
        • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
        • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value will return “null” if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, it is evaluated as “false” if they are used in a condition expression.
        • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configured in each parameter.
        • ON_ERROR_RETRY_IGNORE: rerun the wrapper as with ON_ERROR_RETRY, but continue with the wrapper execution if the error persists after the retries.
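The four error actions can be compared side by side with a small simulation. This is a sketch under stated assumptions, not the ITPilot implementation: StubComponent and failWith are invented for the example, the constants are stand-ins for the ITPilot ones, the stub retries the failing operation itself without waiting, and it treats an error type with no configured action as ON_ERROR_RAISE.

```javascript
// Stand-ins for the ITPilot constants.
var CONNECTION_ERROR = "CONNECTION_ERROR";
var TIMEOUT_ERROR = "TIMEOUT_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
var ON_ERROR_RETRY = "ON_ERROR_RETRY";
var ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Invented helper: raises an error tagged with its type.
function failWith(errorId) {
    var e = new Error(errorId);
    e.errorId = errorId;
    throw e;
}

// Hypothetical component that applies the configured action per error type.
function StubComponent(operation) {
    this.operation = operation;
    this.actions = {};
    this.retries = 0;
}
StubComponent.prototype.onError = function (errorId, errorAction) {
    this.actions[errorId] = errorAction;
};
StubComponent.prototype.setRetries = function (count) { this.retries = count; };
StubComponent.prototype.exec = function () {
    var attempt = 0;
    for (;;) {
        try {
            return this.operation(attempt);
        } catch (e) {
            var action = this.actions[e.errorId] || ON_ERROR_RAISE;
            var retrying = (action === ON_ERROR_RETRY || action === ON_ERROR_RETRY_IGNORE);
            if (retrying && attempt < this.retries) {
                attempt = attempt + 1;
                continue;
            }
            if (action === ON_ERROR_IGNORE || action === ON_ERROR_RETRY_IGNORE) {
                return null; // the run continues with a null result
            }
            throw e; // ON_ERROR_RAISE, or ON_ERROR_RETRY with the retries exhausted
        }
    }
};

// Times out twice, then answers.
var flaky = new StubComponent(function (attempt) {
    if (attempt < 2) { failWith(TIMEOUT_ERROR); }
    return "page";
});
flaky.onError(TIMEOUT_ERROR, ON_ERROR_RETRY);
flaky.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
flaky.setRetries(3);
var recovered = flaky.exec();

// Always fails, but the error is ignored.
var broken = new StubComponent(function () { failWith(CONNECTION_ERROR); });
broken.onError(CONNECTION_ERROR, ON_ERROR_IGNORE);
var ignored = broken.exec();
```

Since onError can be invoked once per error type, a component can retry on timeouts while still stopping the run on connection errors, as the first component does.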

5.3.3.2 debugLevel function

• debugLevel(level): this sets the trace level to be used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace file and 5 means that all message types will be written to it. The log types are the following:
    o TRACE
    o DEBUG
    o INFO
    o WARN
    o ERROR
    o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
    o Constructor()
    o exec(record, list): executes the function.
        • record: record to be added to the list.
        • list: list to which the record is added.

5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
    o Constructor(expr)
        • expr: this parameter defines the condition expression. It is expressed as a character string (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    o exec(elements): main function of the Condition component. This evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
        • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2,… ELEMENTN]”, determines the elements on which the condition is evaluated.
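The relationship between the $n placeholders of the expression and the positions of the elements list can be illustrated with a stand-in. The Condition function below is a hypothetical stub written for this sketch: it substitutes each $n with the n-th element and evaluates the result as JavaScript, which is an assumption made for the example and does not reproduce the ITPilot expression language described in Appendix A of [GENER].

```javascript
// Stub Condition: replaces $0, $1, ... with the matching elements and evaluates the result.
function Condition(expr) {
    this.expr = expr;
}
Condition.prototype.exec = function (elements) {
    var source = this.expr.replace(/\$(\d+)/g, function (match, index) {
        // JSON.stringify turns numbers and strings into JavaScript literals.
        return JSON.stringify(elements[Number(index)]);
    });
    return Boolean(new Function("return " + source + ";")());
};

var MyCondition = new Condition("($0 <= $1)");

var inOrder = MyCondition.exec([3, 7]);    // $0 is 3, $1 is 7
var outOfOrder = MyCondition.exec([9, 7]); // $0 is 9, $1 is 7
```

The same Condition object can be evaluated repeatedly with different element lists, which is how the Loop and Repeat components use it.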

5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
    o Constructor(listname): creates an empty list.
        • listname: name of the list of records to be created.
    o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
    o Constructor(): creates a persistent browser and returns its handler.
    o exec(): executes the component.

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 28

538 Diff

bull Object Diff

bull Description the Diff component allows comparing two pages returning the differences between them regarding the retrieved HTML code

bull Functions

o Constructor(additionPrefixLabel additionSuffixLabel deletionPrefixLabel deletionSuffixLabel tokenSeparator)

bull additionPrefixLabel prefix to use when generating the result page for the new content (by default green background HTML tag)

bull additionSuffixLabel suffix to use when generating the result page for the new content (by default green background HTML end tag)

bull deletionPrefixLabel prefix to use when generating the result page for the deleted content (by default red background HTML tag)

bull deletionSuffixLabel prefix to use when generating the result page for the deleted content (by default red background HTML end tag)

bull tokenSeparator indicates the character string used as HTML page element separator when the result page is generated so that each one of them can be adequately identified

o diff (baseCode finalCode) returns ldquotruerdquo if both pages are identical ldquofalserdquo if they are different

bull baseCode character string with the source page content

bull finalCode character string or page object with the target page content

o exec (baseCode finalCode) executes the Diff component returning a character string that represents the HTML content of those pages pointing out the differences between them

bull baseCode character string with the source page content

bull finalCode character string or page object with the target page content

o setAdditionPrefixLabel (additionPrefixLabel) modifies the additional data starting tag

bull additionPrefixLabel prefix to use when generating the result page for new content (by default green background HTML tag)

o setAdditionSuffixLabel(additionSuffixLabel) modifies the additional data ending tag

bull additionSuffixLabel suffix to use when generating the result page for the new content (by default green background HTML end tag)

o setDeletionPrefixLabel(deletionPrefixLabel) modifies the deleted data starting tag

bull deletionPrefixLabel prefix to use when generating the result page for the deleted content (by default red background HTML tag)

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 29

o setDeletionSuffixLabel(deletionSuffixLabel): sets the end tag that marks deleted content.

• deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, a red-background HTML end tag).

o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.

• nullWhenEquals: "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned.

o setIgnoreTagAttributes(simplifyTags): the component does not take HTML tag attributes into account when comparing both pages.

• simplifyTags: "true" means that the HTML tag attributes are ignored; with "false" they are not ignored.

o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.

• toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.

o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page.

• mergedDeletions: with "true", the deleted content is shown. With "false", the configuration set through setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.

o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.

• replacement: Perl [PERL] regular expression.

o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. These regular expressions are applied to the HTML tokens of the page; tokens that match are discarded before the comparison starts.

• regexp: Perl [PERL] regular expression.
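The effect of ignored tokens on a comparison can be sketched outside ITPilot. The following is a minimal illustration in plain JavaScript, not the Diff component itself: tokenize, samePage and the sample pages are hypothetical names introduced here, and the tokenization rule (tags and text runs) is an assumption.

```javascript
// Hypothetical stand-in, not ITPilot code: shows how discarding tokens that
// match a regular expression changes the outcome of a token-level comparison.
function tokenize(html) {
  // Assumed tokenization: each tag and each text run is one token;
  // whitespace-only runs are dropped.
  return (html.match(/<[^>]+>|[^<]+/g) || [])
    .map(function (token) { return token.trim(); })
    .filter(function (token) { return token.length > 0; });
}

function samePage(baseCode, finalCode, ignoredPatterns) {
  // Keep only the tokens that match none of the ignored patterns.
  function keep(token) {
    return !ignoredPatterns.some(function (re) { return re.test(token); });
  }
  var a = tokenize(baseCode).filter(keep);
  var b = tokenize(finalCode).filter(keep);
  return a.length === b.length &&
    a.every(function (token, i) { return token === b[i]; });
}

var before = "<p>Price: 10</p><span>Visits: 1041</span>";
var after = "<p>Price: 10</p><span>Visits: 1042</span>";

console.log(samePage(before, after, []));                // false
console.log(samePage(before, after, [/^Visits: \d+$/])); // true
```

A volatile fragment such as a visit counter would otherwise make every pair of pages look different; discarding it first keeps the comparison focused on meaningful changes.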

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.

• Functions:

o Constructor(expression)

• expression: object that defines the condition expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.

• exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
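The positional placeholders ($0, $1, ...) refer to the elements of the list passed to exec. The sketch below illustrates only that binding, in plain JavaScript: compileExpression is a hypothetical helper, and evaluating the expression as JavaScript is an assumption made for illustration. ITPilot evaluates expressions with its own function set (Appendix A of [GENER]).

```javascript
// Hypothetical stand-in, not ITPilot code: binds $0, $1, ... to the elements
// of the input list and evaluates the resulting expression.
function compileExpression(expression) {
  // "$n" becomes "args[n]"; "$1" in the replacement string is the captured index.
  var body = expression.replace(/\$(\d+)/g, "args[$1]");
  // Assumption: the expression is valid JavaScript once placeholders are bound.
  return new Function("args", "return (" + body + ");");
}

var lessOrEqual = compileExpression("($0 <= $1)");

console.log(lessOrEqual([3, 7])); // true
console.log(lessOrEqual([9, 7])); // false
```

Building a function from a string is acceptable in a sketch with trusted input; it should not be applied to expressions that come from untrusted sources.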

5.3.11 Extractor

• Object: Extractor

• Description: responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).

• Functions:

o Constructor(name, page, specification, structure)

• name: name of the Extractor component instance.

• page: page-type ITPilot structure from which data is to be extracted.

• specification: DEXTL data extraction specification (see [DEXTL]).

• structure: name of the record (previously created) that will be used to return the data extracted by the specification.

o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.

o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).

• merge: Boolean parameter; "true" if the pattern-merging technique is to be applied, "false" if not. It is "true" by default.

o setI18n(i18n): updates the internationalization configuration of the process.

• i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

o Constructor(url, sequenceType, reusableConnection, binary, page)

• url: URL where the resource to be downloaded can be found (OPTIONAL).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

• binary: "true" if the object to be downloaded is binary; "false" if it is in text format.

• page: optionally, the page from which the HTTP request is launched.

o exec(page): runs the component, returning the string-type or binary-type value obtained.

• page: optionally, the page from which the HTTP request is launched.

o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.

• encoding: MIME type of the information to send.

o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.

• flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.

o setBackSequence(back): optionally sets an explicit navigation sequence that returns to the source page, against which further extraction operations are going to be executed.

• back: back sequence NSEQL program.

o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

• reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.

o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor a POST navigation has been defined.

o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation

• 1: Internet Explorer browser implementation

• 2: Firefox browser implementation

• 3: Denodo HTTP browser implementation
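The order in which the three ways of returning to a page are considered (POST synchronization, explicit back sequence, Back() command) can be summarized as a small decision function. This is a sketch in plain JavaScript under stated assumptions: chooseBackStrategy and the shape of its config object are hypothetical, and the fallback of one page when no value was set with setBackPages is an assumption, not documented behavior.

```javascript
// Hypothetical stand-in, not ITPilot code: picks the way of returning to the
// source page from the settings described for syncWithPost, setBackSequence
// and setBackPages.
function chooseBackStrategy(config) {
  if (config.syncWithPost) {
    // Re-send the POST parameters originally used to reach the page.
    return { kind: "POST" };
  }
  if (config.backSequence) {
    // An explicit NSEQL back program takes precedence over Back().
    return { kind: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // Assumption: go back one page when no page count has been configured.
  return { kind: "BACK_COMMAND", pages: config.backPages || 1 };
}

console.log(chooseBackStrategy({ syncWithPost: true }).kind);
// POST
console.log(chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" }).kind);
// BACK_SEQUENCE
console.log(chooseBackStrategy({ syncWithPost: false, backPages: 2 }).pages);
// 2
```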

5.3.13 Filter

• Object: Filter

• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

o Constructor(expr, auxiliaryRecords)

• expr: regular expression of the filtering operation for a list of records, which are described in the exec function.

• auxiliaryRecords: record list that participates in the filter condition but whose records are not the ones to filter.

o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.

• inputRecords: list of input records.

• auxiliaryRecords: record list that participates in the filter condition but whose records are not the ones to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
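The behavior described in the note, where a record that causes an error is left out and the rest of the list is still filtered, can be sketched as follows. This is plain JavaScript for illustration only: filterRecords, the ignoreErrors flag and the sample records are hypothetical and do not reproduce the Filter component or its error handlers.

```javascript
// Hypothetical stand-in, not ITPilot code: filters a record list and, when
// errors are ignored, skips only the record whose evaluation failed.
function filterRecords(inputRecords, predicate, ignoreErrors) {
  var result = [];
  inputRecords.forEach(function (record) {
    try {
      if (predicate(record)) {
        result.push(record);
      }
    } catch (error) {
      if (!ignoreErrors) {
        throw error; // comparable to raising the error to the caller
      }
      // Otherwise drop the offending record and continue with the next one.
    }
  });
  return result;
}

function isCheap(record) {
  if (record.price === null) {
    throw new Error("record has no price");
  }
  return record.price < 10;
}

var records = [{ price: 5 }, { price: null }, { price: 20 }, { price: 8 }];
var cheap = filterRecords(records, isCheap, true);

console.log(cheap.map(function (r) { return r.price; }).join(",")); // 5,8
```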

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: generates an execution loop over a specific form, using predetermined values for each of the included fields in each run.

• Functions:

o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

• findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

• submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

• baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.

• inputPage: input page from which the selected form is iteratively invoked.

• parallelIterator: with "true", the component executes its iterations in parallel.

o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting with position 0.

• positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

• clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field of the chosen form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting with position 0.

• valuesArray: list of values that must be selected in the field.

• positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

• equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).

• clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectPositions(field, position, positions): indicates the values selected from a selection field of the chosen form.

• field: name of the HTML selection field.

• position: position occupied by the field in the event of more than one field element with the same name.

• positions: values of the elements on which the component must iterate.

o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.

• field: name of the HTML text field.

• position: position of the field in the event of several fields on the form with the same name.

• values: list of values that must be selected in the field.

• positions: list that indicates the position held for each values element in the event of replicated values.

• equals: Boolean value that indicates whether the field values must exactly match those provided to the function ("true") or may merely be contained in them ("false").

o click(field, value, state): selects an element and runs a "click" event on it.

• field: name of the HTML field on which the click is to be made.

• value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.

• state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o input(field, position, values): indicates the values added to an input field.

• field: name of the HTML input field.

• position: position of the field in the event of several fields on the form with the same name.

• values: list of values that must be entered in the field.

o textarea(field, position, values): indicates the values added to a text area.

• field: name of the HTML text area field.

• position: position of the field in the event of several fields on the form with the same name.

• values: list of values that must be entered in the field.

o toList(): returns the list with the NSEQL sequences used in each iteration.

o setMaxIterations(count): sets the maximum number of iterations that can be executed.

• count: maximum number of iterations.

o setRetries(count): sets the number of retries in the event of failure.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.

o setParallelIterator(flag): makes the component launch the iterations in parallel.

• flag: with "true", the iterations are executed in parallel.

o next(inputPage): returns the page resulting from running a component iteration.

• inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.

o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.

o close(): closes the iterator.

o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.

• flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

• reusingConnection: if "true", the connection from previous components is reused. With "false", a new browser is opened and the data is imported from the previous session.

o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run, that is, when no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation

• 1: Internet Explorer browser implementation

• 2: Firefox browser implementation

• 3: Denodo HTTP browser implementation
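The way CLICKED_AND_NON_CLICKED_ELEMENT multiplies the number of form submissions can be illustrated with a short enumeration. This is a sketch in plain JavaScript, not the Form_Iterator implementation: expandStates is a hypothetical helper, and the three constants are redefined here as placeholder strings because their actual values inside ITPilot are not given in this guide.

```javascript
// Placeholder values, assumed for illustration only; ITPilot defines the
// real constants.
var CLICKED_ELEMENT = "CLICKED_ELEMENT";
var NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
var CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Hypothetical stand-in: expands a clickedArray into every combination of
// marked (true) and unmarked (false) elements it describes.
function expandStates(clickedArray) {
  return clickedArray.reduce(function (combinations, state) {
    var options = state === CLICKED_AND_NON_CLICKED_ELEMENT
      ? [true, false]                 // two branches: marked and unmarked
      : [state === CLICKED_ELEMENT];  // a single fixed branch
    var expanded = [];
    combinations.forEach(function (combination) {
      options.forEach(function (option) {
        expanded.push(combination.concat([option]));
      });
    });
    return expanded;
  }, [[]]);
}

var combos = expandStates([
  CLICKED_ELEMENT,
  CLICKED_AND_NON_CLICKED_ELEMENT,
  NON_CLICKED_ELEMENT
]);

console.log(JSON.stringify(combos));
// [[true,true,false],[true,false,false]]
```

Each element configured with CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of combinations in this sketch, which is one reason a cap such as setMaxIterations is useful.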

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.

• Functions:

o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.

• browserUuid: browser id.

o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters for later browsing by means of a Sequence object (see section 5.3.27).

• pageType: type of browser used to access the page:

• SEQUENCE_IEBROWSER = 1

• SEQUENCE_HTTP_BROWSER = 2

• lastURL: last URL the page is coming from.

• lastURLMethod: access method (GET, POST) of the URL the page is coming from.

• lastURLPostParameters: POST-method parameters of the URL the page is coming from.

• cookie: information storage "cookies".

• proxyUser: user name to access the proxy, if required.

• proxyPassword: user password to access the proxy, if required.

• proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: responsible for storing the structure of the input data, that is, the data that the wrapper receives from the calling application.

• Functions:

o Constructor(input, output)

• input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

• output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).

o get(name): returns the value of a field of the record created as the group of initialization parameters.

• name: name of the record field.

o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether queries must supply a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether queries must supply a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether queries must supply a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether queries must supply a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether queries must supply a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

• field: name of the field to create.

• obl: indicates whether queries must supply a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether queries must supply a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether queries must supply a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

• field: name of the field to create.

• format: representation format of the date field. The format is optional, but once it is specified it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].

• obl: indicates whether queries must supply a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setName(name): updates the component name.

• name: new component name.

o setI18n(i18n): updates the internationalization (i18n) configuration of the process.

• i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

o exec(): main function for running the component; it returns a record representing the wrapper initialization parameters.
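The three obligation modes can be read as a validation rule applied to each incoming query. The sketch below shows one possible reading in plain JavaScript; it is not the Init component. resolveParameters and the field descriptors are hypothetical, the constants are placeholder strings, and two behaviors are assumptions made for the example: a missing optional parameter becomes null, and a FIXED value overrides whatever the query supplies.

```javascript
// Placeholder values, assumed for illustration only.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical stand-in, not ITPilot code: builds the initialization record
// from the declared fields and the values supplied by a query.
function resolveParameters(fields, query) {
  var record = {};
  fields.forEach(function (field) {
    var supplied = Object.prototype.hasOwnProperty.call(query, field.name);
    if (field.obl === FIXED) {
      record[field.name] = field.fixedValue;  // assumption: constant wins
    } else if (supplied) {
      record[field.name] = query[field.name];
    } else if (field.obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + field.name);
    } else {
      record[field.name] = null;              // assumption: optional defaults to null
    }
  });
  return record;
}

var fields = [
  { name: "query", obl: OBLIGATORY },
  { name: "lang", obl: FIXED, fixedValue: "en" },
  { name: "maxPrice", obl: OPTIONAL }
];

console.log(JSON.stringify(resolveParameters(fields, { query: "laptop" })));
// {"query":"laptop","lang":"en","maxPrice":null}
```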

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records one by one.

• Functions:

o Constructor(list)

• list: list of records on which to iterate.

o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.

o next(): returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29.

var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}

Figure 5: Using threads in the Iterator component
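The hasNext()/next() contract can be tried outside ITPilot with a small stand-in. The sketch below is plain JavaScript: ListIterator is a hypothetical class written for this example and is not the ITPilot Iterator object, although it follows the same two-method protocol over an ordered list.

```javascript
// Hypothetical stand-in, not ITPilot code: iterates over a list of records
// in order, one element per call to next().
function ListIterator(list) {
  this.list = list;
  this.position = 0;
}

ListIterator.prototype.hasNext = function () {
  return this.position < this.list.length;
};

ListIterator.prototype.next = function () {
  var element = this.list[this.position];
  this.position += 1;
  return element;
};

var iterator = new ListIterator([{ id: 1 }, { id: 2 }, { id: 3 }]);
var seen = [];
while (iterator.hasNext()) {
  var recordInstance = iterator.next();
  seen.push(recordInstance.id);
}

console.log(seen.join(",")); // 1,2,3
```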

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.

• Functions:

o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

• uuid: unique identifier of the component.

• uri: connection URL to the database.

• driver: driver class used to connect to the data source.

• userName: user name.

• password: user password.

• structure: structure of the component's output record list. It is defined as a record of values.

• baseRecords: record list to be used.

• maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.

• initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.

• checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.

• query: SQL query that returns the results required by the component.

o exec(query, baseRecords): executes the JDBCExtractor component.

• query: SQL query that returns the results required by the component.

• baseRecords: record list to be used.

o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

• maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.

• initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.

• pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.

o disablePool(): disables the connection pool.

o addDriverProperty(propname, propvalue): adds a JDBC driver property.

• propname: property name.

• propvalue: property value.

5.3.19 Loop

• Description: allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript with a while loop that uses a Condition object as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: allows iteration through different inter-related pages by means of one or several browsing sequences.

• Functions:

o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

• sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.

• iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information.

• inputPage: indicates the page from which the next browsing sequence is to be run.

o next(inputRecords, inputPage): returns the next iteration element.

• inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.

• inputPage: indicates the page from which the next pages are to be accessed.

o close(): closes the iterator.

o setRetries(count): configures the number of retries in the event of an error when accessing the next page.

• count: number of retries.

o setRetryDelay(count): configures the interval between two retries.

• count: interval in milliseconds.

o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.

• flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

• reusingConnection: if "true", the connection from previous components is reused. With "false", a new browser is opened and the data is imported from the previous session.

o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run, that is, when no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation

• 1: Internet Explorer browser implementation

• 2: Firefox browser implementation

• 3: HTTP browser implementation

5.3.21 Output

• Object: Output

• Description: this places a record in the wrapper output.

• Functions:

  o Constructor(structure)

    • structure: parameter that indicates the component input record to be used as the wrapper result.

  o add(record): this allows a component input record to be added subsequently to the wrapper result.

    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: this allows a record to be constructed using other records generated in the flow, as well as generating new attributes derived from already existing ones.

• Functions:

  o Constructor(recordsObj, name)

    • recordsObj: list of input elements. Each element from the list can be a record or a list of records.

    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.

    • fieldName: name of the field.

    • expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    • errorAction: action to be run in the event of it not being possible to evaluate the expression correctly. The possible values are:

      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

      • ON_ERROR_IGNORE: ignore the error, continuing with the wrapper run.

  o exec(): this runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
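
As a rough illustration of how add and exec fit together, the sketch below builds one record from two input records. It is meant to run outside ITPilot, so Record_Constructor and the two error constants are redefined here as simplified stand-ins. The way “$N.FIELD” expressions are resolved and the way ON_ERROR_IGNORE is handled (the failing field is simply skipped) are assumptions made for the example, not the product's implementation.

```javascript
// Stand-in for ITPilot's Record_Constructor so the sketch runs on its own.
// Assumption: an expression "$N.FIELD" copies FIELD from the N-th input record.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";   // placeholder values, not the real constants
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function Record_Constructor(recordsObj, name) {
  this.records = recordsObj;
  this.name = name;
  this.fields = [];
}

Record_Constructor.prototype.add = function (fieldName, expression, errorAction) {
  this.fields.push({ name: fieldName, expression: expression, errorAction: errorAction });
};

Record_Constructor.prototype.exec = function () {
  var out = {};
  for (var i = 0; i < this.fields.length; i++) {
    var f = this.fields[i];
    var m = /^\$(\d+)\.(\w+)$/.exec(f.expression);
    var source = m ? this.records[Number(m[1])] : undefined;
    if (!source || !(m[2] in source)) {
      if (f.errorAction === ON_ERROR_RAISE) {
        throw new Error("Cannot evaluate " + f.expression);
      }
      continue; // ON_ERROR_IGNORE: skip this field and carry on
    }
    out[f.name] = source[m[2]];
  }
  return out;
};

var book = { TITLE: "Dune", PRICE: 9.5 };
var shop = { NAME: "Corner Books" };

var rc = new Record_Constructor([book, shop], "OFFER");
rc.add("TITLE", "$0.TITLE", ON_ERROR_RAISE);
rc.add("SHOP", "$1.NAME", ON_ERROR_RAISE);
rc.add("ISBN", "$0.ISBN", ON_ERROR_IGNORE); // field missing in the input, ignored
var offer = rc.exec();
```

In a real wrapper only the last block (constructor call, add calls and exec) would be written; the objects and constants are supplied by the ITPilot runtime.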

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: This creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.

    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used to date is reused, or whether a new browser is launched maintaining the session's information. In general, this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.

    • inputPage: optional; this allows for a homepage to be indicated.

  o exec(): this returns a page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.

• Functions:

  o Constructor(page)

    • page: page loaded on the browser that is going to be released.

  o Constructor(browserUuid)

    • browserUuid: browser identifier.

  o exec(): executes the component.

5.3.25 Repeat

• Description: This allows for loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function

5.3.26 Script

• Description: This component allows for part of the description logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When this component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: This creates a browsing sequence in NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)

    • sequence: NSEQL browsing program (see [NSEQL]).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • inputPage: optional parameter; this indicates the starting page. If it is not given, the NSEQL program is run directly.

  o exec(inputValues, inputPage): this runs the Sequence component, returning the last page that the browsing sequence has reached.

    • inputValues: list of values that can be used as input parameters within the browsing sequence.

    • inputPage: optional parameter; this describes the page from which the component browsing sequence is run.

  o setRetries(count): update function for the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): this allows for the waiting time between retries to be indicated.

    • mseconds: this indicates the waiting time between retries, in milliseconds.

  o close(): this closes the connection with the running browser.

  o syncWithPost(flag): this method indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.

  o setBackSequence(back): this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that more data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as the back sequence.

    • pages: number of back pages.

  o toString(): this returns the NSEQL (see [NSEQL]) sequence.

  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
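
The effect of setRetries and setRetryDelay can be pictured with the small sketch below. It is not ITPilot code: execWithRetries is a hypothetical helper written for this example, and it assumes that the retry count is the number of additional attempts made after the first failure. The waiting function is passed in so that the example records the delays instead of actually sleeping.

```javascript
// Hypothetical helper illustrating what setRetries(count) and
// setRetryDelay(mseconds) configure. The real retry loop lives inside the component.
function execWithRetries(action, retries, retryDelayMs, sleep) {
  var attempt = 0;
  for (;;) {
    try {
      return action(attempt);
    } catch (e) {
      if (attempt >= retries) {
        throw e; // retries exhausted: surface the last error
      }
      attempt++;
      sleep(retryDelayMs); // wait before the next attempt
    }
  }
}

var waits = [];
var page = execWithRetries(
  function (attempt) {
    // Simulated browsing sequence: fails twice, then succeeds.
    if (attempt < 2) { throw new Error("timeout"); }
    return "result page";
  },
  3,     // as in setRetries(3)
  3000,  // as in setRetryDelay(3000)
  function (ms) { waits.push(ms); } // records the wait instead of sleeping
);
```

With these values the simulated sequence succeeds on its third attempt, after two waits of 3000 milliseconds each.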

5.3.28 Store File

• Object: StoreFile

• Description: this stores the contents entered as the input parameter in a file.

• Functions:

  o Constructor(content, file)

    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input. In that case, the page content will be stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): this function determines if the output file name should be automatically generated when the input file is null or is a directory.

    • generate: indicates if the file name should be automatically generated.

  o setRetries(count): update function for the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): this allows for the waiting time between retries to be indicated.

    • mseconds: this indicates the waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread

• Description: this represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): This causes the thread to enter standby until all executions invoked with the execute function have finished.

  o execute(functionName, <list of arguments>): this launches the run thread on the described function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests will be queued until the ongoing executions finish.

    • int: maximum number.
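
The calling pattern (configure the limit, call execute once per record, then wait) can be sketched as follows. The Thread object below is a simplified stand-in so that the example runs outside ITPilot: it only queues the work and runs it in batches when wait() is called, whereas the real component runs the functions in parallel. processDetail is a hypothetical processing function, and passing the function itself as the first argument is an assumption of this sketch.

```javascript
// Simplified stand-in for the Thread object (assumption: batches instead of real threads).
function Thread() {
  this.maxConcurrent = 1;
  this.queue = [];
}

Thread.prototype.setMaxConcurrentThreads = function (max) {
  this.maxConcurrent = max;
};

Thread.prototype.execute = function (fn) {
  // Everything after the first argument is passed on to the function.
  var args = Array.prototype.slice.call(arguments, 1);
  this.queue.push({ fn: fn, args: args });
};

Thread.prototype.wait = function () {
  var batches = 0;
  while (this.queue.length > 0) {
    var batch = this.queue.splice(0, this.maxConcurrent);
    for (var i = 0; i < batch.length; i++) {
      batch[i].fn.apply(null, batch[i].args);
    }
    batches++;
  }
  return batches; // returned only so the example can show the batching
};

// Hypothetical per-record processing function.
var processed = [];
function processDetail(record, prefix) {
  processed.push(prefix + record.TITLE);
}

var records = [{ TITLE: "A" }, { TITLE: "B" }, { TITLE: "C" }];
var thread = new Thread();
thread.setMaxConcurrentThreads(2);
for (var r = 0; r < records.length; r++) {
  thread.execute(processDetail, records[r], "done:");
}
var batchCount = thread.wait(); // three requests, at most two at a time
```

The third request is the "later request" of the description: it stays queued until the first two have finished.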

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically by using the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This is the function that defines the component output type. The possible values are:

    • LIST_TYPE = 1

    • PAGE_TYPE = 2

    • RECORD_TYPE = 3

    • SIMPLE_TYPE = 4

    • ARRAY_TYPE = 5

    • BINARY_TYPE = 6

    • BOOLEAN_TYPE = 7

    • DATE_TYPE = 8

    • DOUBLE_TYPE = 9

    • FLOAT_TYPE = 10

    • INT_TYPE = 11

    • LONG_TYPE = 12

    • STRING_TYPE = 13

    • URL_TYPE = 14

    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function is responsible for defining the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
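
A minimal component file following this layout might look like the sketch below, which returns a simple string and therefore needs no mycustom_getOutputStructure function. The component name, the behaviour of the main function (trimming and upper-casing its input) and the empty input structure are choices made for the example. STRING_TYPE is declared locally only so that the sketch runs on its own; whether the runtime already provides the constant is not covered here.

```javascript
// Sketch of <DENODO_HOME>/metadata/itp-custom-components/mycustom.js
// Value taken from the list of output types; declared here only for standalone use.
var STRING_TYPE = 13;

// Main function: receives the component input and returns its output.
function mycustom_main(mycustom_input) {
  var mycustom_output = null;
  if (mycustom_input !== null && mycustom_input !== undefined) {
    // Example logic only: trim surrounding whitespace and upper-case the value.
    mycustom_output = String(mycustom_input).replace(/^\s+|\s+$/g, "").toUpperCase();
  }
  return mycustom_output;
}

// Input schema. Left empty in this sketch (assumption); a real component would
// build a Record Structure describing its input values.
function mycustom_getInputStructure() {
  return null;
}

// Output type: a plain string, so no output structure has to be defined.
function mycustom_getOutputType() {
  return STRING_TYPE;
}
```

A component whose output type is RECORD_TYPE or LIST_TYPE would additionally define mycustom_getOutputStructure.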

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, then it should be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file, the name of which matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> represents the name of the component.

• <input_parameters> is the list of input parameters the custom component receives as input.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the VQL statement just has to be written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the recently generated JavaScript code.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘\’ character.
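
The escaping described in the note can be automated with a few lines of code. The sketch below is only an illustration: escapeForVql and buildCreateWrapper are hypothetical helpers, and the example assumes that the delimiter is the double quote, which should be checked against [VQL] before use.

```javascript
// Hypothetical helper: escapes the quotes inside the wrapper code with a backslash.
// Assumption: the VQL delimiter is the double quote.
function escapeForVql(jsCode) {
  return jsCode.replace(/"/g, '\\"');
}

// Hypothetical helper: assembles the statement shown above.
function buildCreateWrapper(name, jsCode) {
  return 'CREATE WRAPPER ITP ' + name + ' "' + escapeForVql(jsCode) + '"';
}

var wrapperCode = 'var greeting = "hello";';
var escapedCode = escapeForVql(wrapperCode);
var statement = buildCreateWrapper("my_wrapper", wrapperCode);
```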

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

  o setDate(field, regexp, format, type): creation of a new Date-type field in the record.

    • field: name of the new field.

    • regexp: (optional) regular expression of the character string generation. By default, if no constraint exists, its value is “”.

    • format: (optional) date format following [DATEFORMAT]. By default, its value is d-MMM-yyyy Hh mm ss.

    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

  o setRegister(record, type): creation of a new Record-type field in the record.

    • record: record name.

    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

  o setArray(name, structure, type): creation of a new Array-type field in the record.

    • name: name of the array.

    • structure: data structure that represents the record structure contained in the array.

    • type: (optional) defines whether the parameter is mandatory or not. By default, the field is optional.

  o toString(): This transforms the record into a string of characters for its representation.

When a custom component is created (see section 5.4) from an ITPilot wrapper program, a Record Structure is defined to represent the input values to the custom component.

NOTE: to assign values to the fields of a record, the RECORD_CONSTRUCTOR, as explained in section 5.3.22, must be used, except in the cases of Text, Integer, Float and Link-type fields, for which specific functions apply.

5.3.2.2 Record List

• Object: List

• Functions:

  o setListName(listName): name of the list.

    • listName: name of the list.

  o add(obj): addition of an element to the list.

    • obj: element to add.

  o toArray(): transforms the list into a JavaScript object array.

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore shown in this first section. The catalog indicates which components do not provide some of these “common” functions.

5.3.3.1 onError function

• onError(errorId, errorAction): This informs the component of its behavior in the event of any type of error. The onError function can be invoked several times with different errorId parameter values.

  o errorId: This indicates the type of error for which the behavior is to be managed. The possible values are:

    • RUNTIME_ERROR: error while the component is being run.

    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.

    • HTTP_ERROR: error produced by an HTTP error.

    • TIMEOUT_ERROR: This error is raised if the Web source takes too long to answer. The waiting time is configurable. Where the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.

    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).

  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:

    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

    • ON_ERROR_IGNORE: ignore the error, continuing with the wrapper run. In general, components that have any kind of return value will return “null” if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, it will be evaluated as “false” if they are used in a condition expression.

    • ON_ERROR_RETRY: rerun the wrapper. The number of retries and time between retries are configured in each parameter.

    • ON_ERROR_RETRY_IGNORE: rerun the wrapper as with the ON_ERROR_RETRY error type, but continue with the wrapper execution if the error is still happening after the retries.
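
The sketch below illustrates how repeated onError calls build up a per-error policy, and the difference between ON_ERROR_RAISE and ON_ERROR_IGNORE at exec time. StubComponent and the string constants are stand-ins written for this example so that it runs outside ITPilot; retries are left out, and treating an error with no registered policy as a raise is an assumption.

```javascript
// Placeholder constants and a stand-in component; the real ones are provided by ITPilot.
var RUNTIME_ERROR = "RUNTIME_ERROR";
var TIMEOUT_ERROR = "TIMEOUT_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function StubComponent(action) {
  this.action = action; // function doing the work; may throw { errorId: ... }
  this.policy = {};     // errorId -> errorAction
}

StubComponent.prototype.onError = function (errorId, errorAction) {
  this.policy[errorId] = errorAction;
};

StubComponent.prototype.exec = function () {
  try {
    return this.action();
  } catch (e) {
    if (this.policy[e.errorId] === ON_ERROR_IGNORE) {
      return null; // ignored errors yield null, as described above
    }
    throw e; // ON_ERROR_RAISE, or no policy registered (assumption of this sketch)
  }
};

// A component whose source times out, with timeouts ignored.
var slow = new StubComponent(function () { throw { errorId: TIMEOUT_ERROR }; });
slow.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
slow.onError(TIMEOUT_ERROR, ON_ERROR_IGNORE);
var ignoredResult = slow.exec();

// A component that fails at run time, with the error raised.
var strict = new StubComponent(function () { throw { errorId: RUNTIME_ERROR }; });
strict.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
var raised = false;
try {
  strict.exec();
} catch (e) {
  raised = true;
}
```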

5.3.3.2 debugLevel function

• debugLevel(level): This indicates the trace level to be used when running this component. The possible levels are defined as numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:

  o TRACE

  o DEBUG

  o INFO

  o WARN

  o ERROR

  o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List

• Description: adds a record to a list.

• Functions:

  o Constructor()

  o exec(record, list): executes the function.

    • record: record to be added to the list.

    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition

• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.

• Functions:

  o Constructor(expr)

    • expr: this parameter defines the condition expression. It is expressed as a string of characters (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(elements): main function of the Condition component. This carries out the condition operation, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.

    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
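
The relationship between the positional references in the expression and the elements array can be illustrated as follows. The Condition object below is a deliberately small stand-in that only understands expressions of the form "($i <op> $j)"; the real component accepts the full expression language described in [GENER].

```javascript
// Stand-in for the Condition component, limited to "($i <op> $j)" expressions.
function Condition(expr) {
  this.expr = expr;
}

Condition.prototype.exec = function (elements) {
  var m = /^\(\s*\$(\d+)\s*(<=|>=|==|<|>)\s*\$(\d+)\s*\)$/.exec(this.expr);
  if (!m) {
    throw new Error("Expression not supported by this sketch: " + this.expr);
  }
  var left = elements[Number(m[1])];  // $i refers to the i-th element
  var right = elements[Number(m[3])];
  switch (m[2]) {
    case "<=": return left <= right;
    case ">=": return left >= right;
    case "==": return left == right;
    case "<":  return left < right;
    default:   return left > right;
  }
};

var MyCondition = new Condition("($0 <= $1)");
var holds = MyCondition.exec([3, 5]); // $0 = 3, $1 = 5
var fails = MyCondition.exec([8, 5]); // $0 = 8, $1 = 5
```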

5.3.6 Create List

• Object: Create_List

• Description: creates an empty list.

• Functions:

  o Constructor(listname): creates an empty list.

    • listname: name of the list of records to be created.

  o exec(): runs the component.
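
The sketch below shows how Create_List, Add_Object_To_List (section 5.3.4) and the List functions of section 5.3.2.2 might be combined. All three objects are redefined as simple stand-ins so that the example runs outside ITPilot, and it is an assumption of the sketch that exec() on Create_List returns the new list.

```javascript
// Stand-ins for the ITPilot objects; only the functions used below are modelled.
function List() {
  this.name = null;
  this.items = [];
}
List.prototype.setListName = function (listName) { this.name = listName; };
List.prototype.add = function (obj) { this.items.push(obj); };
List.prototype.toArray = function () { return this.items.slice(); };

function Create_List(listname) {
  this.listname = listname;
}
Create_List.prototype.exec = function () {
  var list = new List();
  list.setListName(this.listname);
  return list; // assumption: exec() hands back the new, empty list
};

function Add_Object_To_List() {}
Add_Object_To_List.prototype.exec = function (record, list) {
  list.add(record);
};

// Create an empty list, add two records to it and read it back as an array.
var results = new Create_List("RESULTS").exec();
var adder = new Add_Object_To_List();
adder.exec({ TITLE: "Dune" }, results);
adder.exec({ TITLE: "Emma" }, results);
var asArray = results.toArray();
```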

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser

• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.

• Functions:

  o Constructor(): creates a persistent browser and returns its handler.

  o exec(): executes the component.

5.3.8 Diff

• Object: Diff

• Description: the Diff component compares two pages, returning the differences between them in the retrieved HTML code.

• Functions:

  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)

    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).

    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).

    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).

    • tokenSeparator: indicates the character string used as HTML page element separator when the result page is generated, so that each one of them can be adequately identified.

  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.

    • baseCode: character string with the source page content.

    • finalCode: character string or page object with the target page content.

  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.

    • baseCode: character string with the source page content.

    • finalCode: character string or page object with the target page content.

  o setAdditionPrefixLabel(additionPrefixLabel): modifies the additional data starting tag.

    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).

  o setAdditionSuffixLabel(additionSuffixLabel): modifies the additional data ending tag.

    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).

  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the deleted data starting tag.

    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the deleted data ending tag.

    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).

  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return “null” instead of the page itself.

    • nullWhenEquals: “true” implies that “null” will be returned when both pages are equal; “false” means that the result page will be returned.

  o setIgnoreTagAttributes(simplifyTags): the component will not take into account the HTML tag attributes when comparing both pages.

    • simplifyTags: “true” means that the HTML tag attributes will be ignored. With “false”, they will not be ignored.

  o setCaseInsensitive(toLowerCase): used to establish whether capitalization will be taken into account when comparing the pages.

    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.

  o setShowRemovedContent(mergedDeletions): whether the deleted content is shown in the result page or not.

    • mergedDeletions: if “true”, the deleted content will be shown. If the value is “false”, the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.

  o addTokenReplacement(replacement): allows the addition of a regular expression to a list. These regular expressions can be applied on HTML tokens of the source pages before comparing them.

    • replacement: Perl [PERL] regular expression.

  o addIgnoredToken(regexp): allows the addition of a regular expression to the list. These regular expressions can be applied on HTML tokens of the page. Those that match the regular expression will be discarded before starting the comparison.

    • regexp: Perl [PERL] regular expression.
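
To give an idea of what options such as setCaseInsensitive, setIgnoreTagAttributes and addIgnoredToken mean for a comparison, the sketch below normalizes two lists of HTML tokens before comparing them. It is only an illustration of the concept: normalizeTokens and sameTokens are hypothetical helpers, the pages are given as already tokenized arrays, and nothing here reflects how the Diff component is actually implemented.

```javascript
// Hypothetical helper: applies the three options to a list of HTML tokens.
function normalizeTokens(tokens, options) {
  var result = [];
  for (var i = 0; i < tokens.length; i++) {
    var token = tokens[i];
    var skip = false;
    // Tokens matching an "ignored" expression are discarded before comparing.
    for (var j = 0; j < options.ignored.length; j++) {
      if (options.ignored[j].test(token)) { skip = true; break; }
    }
    if (skip) { continue; }
    if (options.ignoreTagAttributes) {
      // Keep only the tag name, e.g. <div class="a"> becomes <div>.
      token = token.replace(/^<(\/?\w+)[^>]*>$/, "<$1>");
    }
    if (options.caseInsensitive) {
      token = token.toLowerCase();
    }
    result.push(token);
  }
  return result;
}

// Hypothetical helper: element-by-element comparison of two token lists.
function sameTokens(a, b) {
  if (a.length !== b.length) { return false; }
  for (var i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) { return false; }
  }
  return true;
}

var basePage = ['<DIV class="a">', "Hello", "<!-- ad 1 -->", "</DIV>"];
var finalPage = ['<div id="x">', "hello", "<!-- ad 2 -->", "</div>"];

var relaxed = { caseInsensitive: true, ignoreTagAttributes: true, ignored: [/^<!--/] };
var strict = { caseInsensitive: false, ignoreTagAttributes: false, ignored: [] };

var identicalRelaxed = sameTokens(normalizeTokens(basePage, relaxed), normalizeTokens(finalPage, relaxed));
var identicalStrict = sameTokens(normalizeTokens(basePage, strict), normalizeTokens(finalPage, strict));
```

With the relaxed options the two pages compare as identical, because case, tag attributes and the comment tokens are left out of the comparison; with the strict options they differ.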

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or use of functions provided by ITPilot) that will be evaluated to an output value.

• Functions:

  o Constructor(expression)

    • expression: object that defines the condition expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(exprInput): method running the component and returning the value resulting from the expression indicated in the component constructor.

    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor

• Description: responsible for extracting structured data from an HTML page, for which it generates a DEXTL program ([DEXTL]).

• Functions:

  o Constructor(name, page, specification, structure)

    • name: name of the Extractor component instance.

    • page: page-type ITPilot structure from which data is to be extracted.

    • specification: DEXTL data extraction specification (see [DEXTL]).

    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.

  o exec(): main extractor method, which runs the specification given in the constructor. It returns a list of records of the type defined by the structure parameter of the constructor.

  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).

    • merge: Boolean parameter; “true” if the pattern merging technique is to be applied, “false” if not. It is “true” by default.

  o setI18n(i18n): updates the internationalization of the process.

    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

  o Constructor(url, sequenceType, reusableConnection, binary, page)

    • url: URL where the resource to be downloaded can be found (OPTIONAL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.

    • page: optionally, the page from which the HTTP request is launched.

  o exec(page): runs the component, returning the string- or binary-type value obtained.

    • page: optionally, the page from which the HTTP request is launched.

  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.

    • encoding: MIME type of the information to send.

  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were initially used to access that page. This is the default synchronization method.

    • flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which further data extraction operations will be executed.

    • back: NSEQL back sequence program.

  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

    • reusingConnection: if set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing the information from the previous session.

  o setBackPages(pages): sets the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor POST-based synchronization has been configured.

  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
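
The order in which syncWithPost, setBackSequence and setBackPages are taken into account can be summarized as a small decision function. The following is only an illustrative sketch in plain JavaScript: chooseBackStrategy, its option names and the returned labels are hypothetical and are not part of the ITPilot API, and the default of one page back is an assumption.

```javascript
// Hypothetical helper (not an ITPilot function): models how the page state
// is recovered, following the order described for syncWithPost(),
// setBackSequence() and setBackPages().
function chooseBackStrategy(options) {
    if (options.syncWithPost) {
        // Re-send the POST parameters originally used to reach the page.
        return { strategy: "POST" };
    }
    if (options.backSequence) {
        // An explicit NSEQL back sequence was configured.
        return { strategy: "BACK_SEQUENCE", sequence: options.backSequence };
    }
    // Otherwise fall back to the NSEQL Back() command.
    // Assumption: one page back when setBackPages() was never called.
    var pages = options.backPages === undefined ? 1 : options.backPages;
    return { strategy: "BACK_COMMAND", pages: pages };
}

var chosen = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
console.log(chosen.strategy + " " + chosen.pages); // prints BACK_COMMAND 2
```

The same fallback order is described for the Form Iterator and Next Interval Iterator components.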

5.3.13 Filter

• Object: Filter

• Description: filters a list of records, returning those that meet a given condition.

• Functions:

  o Constructor(expr, auxiliaryRecords)

    • expr: regular expression of the filtering operation, applied to the list of records described in the exec function.

    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression given in the constructor.

    • inputRecords: list of input records.

    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
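
The effect of ON_ERROR_IGNORE described in the note can be pictured with a plain JavaScript stand-in. This is a sketch only: filterRecords and the string constants below are hypothetical substitutes, and the real Filter component evaluates an ITPilot expression rather than a JavaScript callback.

```javascript
// Stand-in constants and function for illustration only.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function filterRecords(inputRecords, predicate, errorAction) {
    var result = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) {
                result.push(inputRecords[i]);
            }
        } catch (e) {
            if (errorAction !== ON_ERROR_IGNORE) {
                throw e; // ON_ERROR_RAISE: stop and report the error
            }
            // ON_ERROR_IGNORE: skip only the record that caused the error
        }
    }
    return result;
}

// Hypothetical records; the second one makes the condition fail with an error.
var records = [{ price: 10 }, { price: null }, { price: 30 }];
var cheapEnough = function (record) {
    if (record.price === null) {
        throw new Error("price is missing");
    }
    return record.price <= 30;
};

var kept = filterRecords(records, cheapEnough, ON_ERROR_IGNORE);
console.log(kept.length); // prints 2
```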

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: generates an execution loop over a specific form, in which predetermined values for each of its fields are used in each iteration.

• Functions:

  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.

    • inputPage: input page from which the selected form is iteratively invoked.

    • parallelIterator: if “true”, the component executes its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • valuesArray: list of values that must be selected in the field.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or only contained in it (equals = false).

    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.

    • field: name of the HTML selection field.

    • position: position of the field when more than one field has the same name.

    • positions: values of the elements over which the component must iterate.

  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.

    • field: name of the HTML text field.

    • position: position of the field when several fields on the form have the same name.

    • values: list of values that must be selected in the field.

    • positions: list that indicates the position held by each element of values in the event of replicated values.

    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.

  o click(field, value, state): selects an element and runs a “click” event on it.

    • field: name of the HTML field on which the click is to be made.

    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.

    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o input(field, position, values): indicates the values added to an input field.

    • field: name of the HTML input field.

    • position: position of the field when several fields on the form have the same name.

    • values: list of values that must be selected in the field.

  o textarea(field, position, values): indicates the values added to a text area.

    • field: name of the HTML input field.

    • position: position of the field when several fields on the form have the same name.

    • values: list of values that must be selected in the field.

  o toList(): returns the list of NSEQL sequences used in each iteration.

  o setMaxIterations(count): sets the maximum number of iterations that can be executed.

    • count: maximum number of iterations.

  o setRetries(count): sets the number of retries in the event of failure.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o setParallelIterator(flag): makes the component launch its iterations in parallel.

    • flag: if “true”, the iterations are executed in parallel.

  o next(inputPage): returns the page resulting from running one iteration of the component.

    • inputPage: optional parameter that indicates a new starting page on which the next iteration is run.

  o hasNext(): determines whether there are more results. It returns “true” if there is at least one more result, or “false” if there is not.

  o close(): closes the iterator.

  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.

    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command is run, that is, when no back sequence has been explicitly defined and POST navigation has not been configured as the back sequence.

  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
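
To make the effect of CLICKED_AND_NON_CLICKED_ELEMENT concrete, the following plain JavaScript sketch expands a clickedArray into the marked/unmarked combinations it stands for. expandClickStates is a hypothetical helper, the constants are string stand-ins for the ones predefined by ITPilot, and the order in which ITPilot actually visits the combinations is not specified here.

```javascript
// Stand-in values; the real constants are predefined by ITPilot.
var CLICKED_ELEMENT = "CLICKED_ELEMENT";
var NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
var CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Hypothetical helper: expands a clickedArray into every combination of
// marked (true) and unmarked (false) elements that it represents.
function expandClickStates(clickedArray) {
    var combinations = [[]];
    clickedArray.forEach(function (state) {
        var choices;
        if (state === CLICKED_ELEMENT) {
            choices = [true];
        } else if (state === NON_CLICKED_ELEMENT) {
            choices = [false];
        } else {
            choices = [true, false]; // one marked and one unmarked variant
        }
        var next = [];
        combinations.forEach(function (combination) {
            choices.forEach(function (choice) {
                next.push(combination.concat([choice]));
            });
        });
        combinations = next;
    });
    return combinations;
}

var combos = expandClickStates([
    CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT
]);
console.log(combos.length); // prints 4
```

Each element flagged with CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of combinations, which is worth keeping in mind together with setMaxIterations.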

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool by means of a previously retrieved identification code.

• Functions:

  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.

    • browserUuid: browser id.

  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing with a Sequence object (see section 5.3.27).

    • pageType: type of browser used to access the page:

      • SEQUENCE_IEBROWSER = 1

      • SEQUENCE_HTTP_BROWSER = 2

    • lastURL: last URL the page comes from.

    • lastURLMethod: access method (GET, POST) of the URL the page comes from.

    • lastURLPostParameters: POST-method parameters of the URL the page comes from.

    • cookie: stored “cookies” information.

    • proxyUser: user name to access the proxy, if required.

    • proxyPassword: user password to access the proxy, if required.

    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: responsible for storing the structure of the input data, that is, the data that the wrapper will receive from the calling application.

• Functions:

  o Constructor(input, output)

    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (by the exec() function).

  o get(name): returns the value of a field of the record created from the initialization parameters.

    • name: name of the record field.

  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether a value for the field must be supplied in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

    • field: name of the field to create.

    • format: representation format of the date field. The format is optional but, once it is specified, complying with it becomes compulsory; otherwise the wrapper may fail to run. The representation format is defined in [DATEFORMAT].

    • obl: indicates whether a value for the field must be supplied in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): updates the component name.

    • name: new component name.

  o setI18n(i18n): updates the internationalization (i18n) of the process.

    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

  o exec(): main function, which runs the component and returns a record representing the wrapper initialization parameters.
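
The difference between OPTIONAL, OBLIGATORY and FIXED fields can be illustrated with a small stand-in written in plain JavaScript. Everything in this sketch is hypothetical: InitSketch is not the ITPilot Init object, setField stands in for the typed setText/setInt/... functions, the field names are invented, and returning null for an unset optional field is an assumption.

```javascript
// Stand-ins for illustration; ITPilot predefines the real constants and Init.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

function InitSketch(input) {
    this.input = input || {};
    this.fields = [];
}

// Registers a field; obl defaults to OPTIONAL, as in the description above.
InitSketch.prototype.setField = function (field, obl, fixedValue) {
    this.fields.push({ name: field, obl: obl || OPTIONAL, fixedValue: fixedValue });
    return this;
};

// Builds the initialization record from the registered fields.
InitSketch.prototype.exec = function () {
    var record = {};
    var input = this.input;
    this.fields.forEach(function (f) {
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue; // constant, ignores the query value
        } else if (input[f.name] !== undefined) {
            record[f.name] = input[f.name];
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null; // assumption: unset optional fields are null
        }
    });
    return record;
};

// Hypothetical query parameters supplied by the calling application.
var init = new InitSketch({ TITLE: "ITPilot" });
init.setField("TITLE", OBLIGATORY)
    .setField("MAX_RESULTS")
    .setField("LANGUAGE", FIXED, "en");
var initRecord = init.exec();
console.log(initRecord.LANGUAGE); // prints en
```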

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records, one by one.

• Functions:

  o Constructor(list)

    • list: list of records over which to iterate.

  o hasNext(): determines whether there are more results over which to iterate. It returns “true” if there is at least one more result.

  o next(): returns the next element of the iteration. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
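
For the sequential (non-parallel) case, the hasNext()/next() contract leads to the usual iteration loop. The sketch below runs outside ITPilot: IteratorSketch is a minimal stand-in for the Iterator component, and the records and the NAME field are hypothetical.

```javascript
// Minimal stand-in for the Iterator component (illustration only).
function IteratorSketch(list) {
    this.list = list;
    this.position = 0;
}
IteratorSketch.prototype.hasNext = function () {
    return this.position < this.list.length;
};
IteratorSketch.prototype.next = function () {
    return this.list[this.position++];
};

// Hypothetical list of records.
var iterator = new IteratorSketch([{ NAME: "a" }, { NAME: "b" }, { NAME: "c" }]);
var visited = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    visited.push(recordInstance.NAME); // per-record processing goes here
}
console.log(visited.join(",")); // prints a,b,c
```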

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.

• Functions:

  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

    • uuid: unique identifier of the component.

    • uri: connection URL to the database.

    • driver: driver class used to connect to the data source.

    • userName: user name.

    • password: user password.

    • structure: structure of the component’s output record list. It is defined as a record of values.

    • baseRecords: record list to be used.

    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.

    • initialPoolSize: initial number of pool connections. This number of idle connections is established and kept ready to be used.

    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

    • query: SQL query that returns the results required by the component.

  o exec(query, baseRecords): executes the JDBCExtractor component.

    • query: SQL query that returns the results required by the component.

    • baseRecords: record list to be used.

  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.

    • initialPoolSize: initial number of pool connections. This number of idle connections is established and kept ready to be used.

    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.

  o addDriverProperty(propname, propvalue): adds a JDBC driver property.

    • propname: property name.

    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: allows iterating over a series of inter-related pages by means of one or several browsing sequences.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.

    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.

    • inputPage: indicates the page from which the next browsing sequence is to be run.

  o next(inputRecords, inputPage): returns the next element of the iteration.

    • inputRecords: list of input records that can be used as parameters within the browsing sequences of the next interval.

    • inputPage: indicates the page from which the next pages are to be accessed.

  o close(): closes the iterator.

  o setRetries(count): sets the number of retries in the event of an error when accessing the next page.

    • count: number of retries.

  o setRetryDelay(count): sets the interval between two retries.

    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, to retrieve the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.

    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command is run, that is, when no back sequence has been explicitly defined and POST navigation has not been configured as the back sequence.

  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: HTTP browser implementation
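
The rule given for the sequences parameter (a single sequence is used in every iteration, several sequences are used one per iteration) can be sketched as follows. sequenceForIteration is a hypothetical helper in plain JavaScript, the sequence strings are placeholders rather than real NSEQL programs, and returning null once the list is exhausted is an assumption.

```javascript
// Hypothetical helper: returns the browsing sequence that applies to a
// given zero-based iteration number.
function sequenceForIteration(sequences, iteration) {
    if (sequences.length === 1) {
        // A single sequence is used in all iterations.
        return sequences[0];
    }
    // Several sequences: one per iteration.
    // Assumption: null signals that no sequence is left.
    return iteration < sequences.length ? sequences[iteration] : null;
}

var single = ["<NSEQL next-page sequence>"];
var several = ["<NSEQL sequence A>", "<NSEQL sequence B>"];

console.log(sequenceForIteration(single, 5));  // prints <NSEQL next-page sequence>
console.log(sequenceForIteration(several, 1)); // prints <NSEQL sequence B>
```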

5.3.21 Output

• Object: Output

• Description: places a record in the wrapper output.

• Functions:

  o Constructor(structure)

    • structure: indicates the component input record to be used as the wrapper result.

  o add(record): adds the given component input record to the wrapper result.

    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.

• Functions:

  o Constructor(recordsObj, name)

    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.

    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): adds a new field to the record under construction.

    • fieldName: name of the field.

    • expression: expression that defines the field; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].

    • errorAction: action to take when the expression cannot be evaluated correctly. The possible values are:

      • ON_ERROR_RAISE: stops the wrapper execution, indicating the source of the error.

      • ON_ERROR_IGNORE: ignores the error and continues the wrapper execution.

  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
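
The way a field expression refers to an input record by position, together with the two error actions, can be pictured with the following plain JavaScript sketch. It is not the ITPilot implementation: buildRecord and resolveReference are hypothetical, the sketch assumes references are written as a record index, a dot and a field name (for example $0.PARAM1), it supports nothing beyond such references, and dropping a failed field under ON_ERROR_IGNORE is an assumption.

```javascript
// Stand-ins for illustration only.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Resolves a reference such as "$0.PARAM1" against the list of input records.
function resolveReference(recordsObj, expression) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = recordsObj[Number(match[1])];
    if (record === undefined || !(match[2] in record)) {
        throw new Error("Cannot evaluate: " + expression);
    }
    return record[match[2]];
}

// Builds the output record field by field, honouring each errorAction.
function buildRecord(recordsObj, fieldDefinitions) {
    var output = {};
    fieldDefinitions.forEach(function (def) {
        try {
            output[def.fieldName] = resolveReference(recordsObj, def.expression);
        } catch (e) {
            if (def.errorAction !== ON_ERROR_IGNORE) {
                throw e; // ON_ERROR_RAISE: stop and report the error
            }
            // ON_ERROR_IGNORE: leave the field out and continue (assumption)
        }
    });
    return output;
}

// Hypothetical input records and field definitions.
var built = buildRecord(
    [{ PARAM1: "first" }, { PARAM2: "second" }],
    [
        { fieldName: "A", expression: "$0.PARAM1", errorAction: ON_ERROR_RAISE },
        { fieldName: "B", expression: "$1.MISSING", errorAction: ON_ERROR_IGNORE }
    ]
);
console.log(JSON.stringify(built)); // prints {"A":"first"}
```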

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; allows a start page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations> …
    } while (!repeat.exec([]));

Figure 7: Using the Repeat function
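The REPEAT… UNTIL behavior can be tried outside ITPilot with a small stand-in. In this sketch, exitCondition is a hypothetical replacement for the exec call of a Condition object, and the page counter is invented for illustration. The body runs at least once and the loop ends as soon as the exit condition becomes true.

```javascript
// Hypothetical exit condition standing in for Condition.exec([...]).
function exitCondition(elements) {
  return elements[0] >= elements[1];
}

let page = 0;
const lastPage = 3;
const visited = [];

do {
  // <loop_operations>: here, just record the page being processed.
  page++;
  visited.push(page);
} while (!exitCondition([page, lastPage]));
```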

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from its position within the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to recover the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was originally reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
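The order in which syncWithPost, setBackSequence and setBackPages come into play can be summarized as a small decision function. This is a hypothetical model written for illustration, not part of the ITPilot API: chooseBackStrategy, its config object and the result shapes are invented names, and the default of one back page is an assumption.

```javascript
// Hypothetical model of how the way back to the source page is chosen.
function chooseBackStrategy(config) {
  if (config.syncWithPost) {
    // Re-issue the POST request that originally produced the page.
    return { kind: "POST" };
  }
  if (config.backSequence) {
    // Run the explicit NSEQL back program set with setBackSequence.
    return { kind: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // Fall back to the NSEQL Back() command, browsing back `backPages` pages.
  return { kind: "BACK_COMMAND", pages: config.backPages || 1 };
}

const byPost = chooseBackStrategy({ syncWithPost: true });
const bySequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
const byBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```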

5.3.28 Store File

• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case the page content will be stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be automatically generated when the input file is null or is a directory.
    • generate: indicates whether the file name should be automatically generated.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
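The queueing effect of setMaxConcurrentThreads can be modeled in plain JavaScript with promises. This is a rough sketch under stated assumptions and not how ITPilot implements threads: ThreadSketch is a hypothetical class, execute takes a function reference rather than a function name, and wait returns a promise instead of blocking.

```javascript
// Hypothetical model of a bounded pool: at most `max` tasks run at once, the rest are queued.
class ThreadSketch {
  constructor() {
    this.max = Infinity;
    this.running = 0;
    this.queue = [];
    this.pending = [];
  }

  setMaxConcurrentThreads(n) {
    this.max = n;
  }

  // Queues the call and starts it as soon as a slot is free.
  execute(fn, ...args) {
    const p = new Promise((resolve, reject) => {
      this.queue.push({ fn, args, resolve, reject });
    });
    this.pending.push(p);
    this._drain();
    return p;
  }

  _drain() {
    while (this.running < this.max && this.queue.length > 0) {
      const { fn, args, resolve, reject } = this.queue.shift();
      this.running++;
      Promise.resolve()
        .then(() => fn(...args))
        .then(resolve, reject)
        .finally(() => {
          this.running--;
          this._drain(); // a slot is free: start the next queued task
        });
    }
  }

  // Resolves once every task launched with execute has finished.
  wait() {
    return Promise.all(this.pending);
  }
}

const thread = new ThreadSketch();
thread.setMaxConcurrentThreads(2);
const processed = [];
function processRecord(record) {
  processed.push(record.ID);
}
for (const record of [{ ID: 1 }, { ID: 2 }, { ID: 3 }, { ID: 4 }, { ID: 5 }]) {
  thread.execute(processRecord, record);
}
thread.wait().then(() => console.log("processed", processed.length, "records"));
```

Right after the loop, two tasks occupy the available slots and the other three are waiting in the queue, which mirrors the "later requests are queued" behavior described above.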

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type returned by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
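Put together, a custom component file could look roughly like the skeleton below. It is a sketch under assumptions: the logic inside mycustom_main and the TITLE field are invented for illustration, the input is modeled as a plain object, RECORD_TYPE is redefined locally only so that the snippet runs on its own, and the two structure functions are left open because the syntax for defining a structure is not shown in this section.

```javascript
// Output type code taken from the list above; redefined here only to keep the sketch self-contained.
var RECORD_TYPE = 3;

// Main function: "mycustom" is the (hypothetical) name of the component.
function mycustom_main(mycustom_input) {
  var mycustom_output = null;
  // Hypothetical logic: return a record with an upper-cased TITLE field.
  mycustom_output = { TITLE: String(mycustom_input.TITLE).toUpperCase() };
  return mycustom_output;
}

// Input schema: the definition syntax is not covered here, so the body is left open.
function mycustom_getInputStructure() {
  /* … */
}

// Declares that the component returns a record.
function mycustom_getOutputType() {
  return RECORD_TYPE;
}

// Required because the output type is RECORD_TYPE; body left open for the same reason.
function mycustom_getOutputStructure() {
  /* … */
}
```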

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used inside it, they must be escaped with the ‘\’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

5.3.3 Common functions

Some functions are common to all or almost all components and are therefore described in this first section. The catalog indicates which components do not provide some of these “common” functions.

5.3.3.1 onError function

• onError(errorId, errorAction): tells the component how to behave in the event of an error. The onError function can be invoked several times with different errorId values.
  o errorId: indicates the type of error for which the behavior is being defined. The possible values are:
    • RUNTIME_ERROR: error while the component is being run.
    • CONNECTION_ERROR: error that occurs when there is some kind of connection problem with the Web source.
    • HTTP_ERROR: error produced by an HTTP error.
    • TIMEOUT_ERROR: error raised if the Web source takes too long to answer. The waiting time is configurable. When the wrapper is used in the run environment, this parameter is configured in the browser pool used (see [USER]). In the generation environment, this value is configured in the ITPAdminConfiguration.properties file, available in <DENODO_HOME>/conf/itp-admin-tool, with the property IEBrowserMAX_DOWNLOAD_TIME1 for Internet Explorer, IEBrowserMAX_DOWNLOAD_TIME2 for Firefox and IEBrowserMAX_DOWNLOAD_TIME3 for the HTTP browser.
    • SEQUENCE_ERROR: error produced when there is a problem with the sequence (the sequence is not correctly written, some command could not be run, etc.).
  o errorAction: action to be taken when the error indicated in the previous parameter arises. The possible values are:
    • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
    • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run. In general, components that have a return value will return “null” if there is an error, except in the following cases: FILTER (5.3.13) and RECORD CONSTRUCTOR (5.3.22). In the cases of LOOP (5.3.19), REPEAT (5.3.25) and CONDITION (5.3.5), even though they return “null”, it will be evaluated as “false” if they are used in a condition expression.
    • ON_ERROR_RETRY: rerun the wrapper. The number of retries and the time between retries are configured in each component.
    • ON_ERROR_RETRY_IGNORE: rerun the wrapper as with ON_ERROR_RETRY, but continue with the wrapper execution if the error still occurs after the retries.
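The difference between the four error actions can be illustrated with a small stand-alone runner. This is a simplified sketch, not ITPilot's implementation: runWithErrorAction is a hypothetical helper, the constant values are invented, the retry repeats only the failing operation rather than the whole wrapper, and raising the error once ON_ERROR_RETRY runs out of retries is an assumption.

```javascript
// Hypothetical constants; in ITPilot these are predefined and their values may differ.
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";
const ON_ERROR_RETRY = "ON_ERROR_RETRY";
const ON_ERROR_RETRY_IGNORE = "ON_ERROR_RETRY_IGNORE";

// Runs `operation` and applies `errorAction` if it throws.
// `retries` plays the role of the value set with setRetries(count).
function runWithErrorAction(operation, errorAction, retries) {
  const retrying = errorAction === ON_ERROR_RETRY || errorAction === ON_ERROR_RETRY_IGNORE;
  const attempts = retrying ? retries + 1 : 1;
  let lastError = null;
  for (let i = 0; i < attempts; i++) {
    try {
      return operation();
    } catch (e) {
      lastError = e;
    }
  }
  if (errorAction === ON_ERROR_IGNORE || errorAction === ON_ERROR_RETRY_IGNORE) {
    return null; // the flow continues; most components yield null in this case
  }
  throw lastError; // ON_ERROR_RAISE, or ON_ERROR_RETRY with its retries exhausted
}

// An operation that fails twice and then succeeds.
let calls = 0;
function flaky() {
  calls++;
  if (calls < 3) throw new Error("TIMEOUT_ERROR");
  return "page";
}

const retried = runWithErrorAction(flaky, ON_ERROR_RETRY, 3);
const ignored = runWithErrorAction(() => { throw new Error("HTTP_ERROR"); }, ON_ERROR_IGNORE, 0);
```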

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level to be used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no message will be written to the log trace and 5 means that all message types will be written to the log trace file. The log types are the following:
  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: defines the condition expression, expressed as a string of characters. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
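How the positional references $0, $1, … map onto the elements passed to exec can be shown with a small stand-in. ConditionSketch below is hypothetical and far more limited than the real component: it only understands a single comparison between two positional references, and the set of operators it accepts is an assumption.

```javascript
// Hypothetical stand-in for the Condition component; supports only "($i <op> $j)".
class ConditionSketch {
  constructor(expr) {
    const m = /^\(?\s*\$(\d+)\s*(<=|>=|==|<|>)\s*\$(\d+)\s*\)?$/.exec(expr);
    if (!m) throw new Error("Unsupported expression: " + expr);
    this.left = Number(m[1]);
    this.op = m[2];
    this.right = Number(m[3]);
  }

  // $i refers to elements[i]; returns true or false.
  exec(elements) {
    const a = elements[this.left];
    const b = elements[this.right];
    switch (this.op) {
      case "<=": return a <= b;
      case ">=": return a >= b;
      case "==": return a === b;
      case "<": return a < b;
      default: return a > b;
    }
  }
}

const myCondition = new ConditionSketch("($0 <= $1)");
const met = myCondition.exec([3, 7]);
const notMet = myCondition.exec([9, 7]);
```

With these inputs, met is true because 3 is less than or equal to 7, and notMet is false because 9 is greater than 7.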

5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.

5.3.8 Diff

• Object: Diff
• Description: the Diff component compares two pages, returning the differences between them in the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, a green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, a green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, a red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, a red background HTML end tag).
    • tokenSeparator: character string used as the HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, a green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, a green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, a red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, a red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return “null” instead of the page itself.
    • nullWhenEquals: “true” means that “null” will be returned when both pages are equal; “false” means that the result page will be returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether HTML tag attributes are ignored when comparing both pages.
    • simplifyTags: “true” means that the HTML tag attributes will be ignored; with “false” they will not be ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page.
    • mergedDeletions: if “true”, the deleted content will be shown. If “false”, the configuration set with setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions are applied to the HTML tokens of the page; tokens that match are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
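The idea of marking added and deleted tokens with prefix and suffix labels can be illustrated with a generic token diff. The sketch below is not the algorithm used by the Diff component, which this guide does not describe: diffTokens is a hypothetical function based on a longest-common-subsequence table, whitespace is assumed as the token separator, and the bracket labels are invented stand-ins for the colored HTML tags.

```javascript
// Hypothetical token-level diff; whitespace is used as the token separator.
function diffTokens(baseCode, finalCode, labels) {
  const a = baseCode.split(/\s+/).filter(Boolean);
  const b = finalCode.split(/\s+/).filter(Boolean);

  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..].
  const lcs = Array.from({ length: a.length + 1 }, () => new Array(b.length + 1).fill(0));
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both token lists, wrapping deletions and additions in their labels.
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      out.push(a[i]);
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      out.push(labels.delPrefix + a[i] + labels.delSuffix);
      i++;
    } else {
      out.push(labels.addPrefix + b[j] + labels.addSuffix);
      j++;
    }
  }
  while (i < a.length) out.push(labels.delPrefix + a[i++] + labels.delSuffix);
  while (j < b.length) out.push(labels.addPrefix + b[j++] + labels.addSuffix);
  return out.join(" ");
}

const labels = { addPrefix: "[+", addSuffix: "+]", delPrefix: "[-", delSuffix: "-]" };
const merged = diffTokens("price 10 EUR", "price 12 EUR", labels);
// merged is "price [-10-] [+12+] EUR"
```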

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that will be evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the expression, expressed as a string of characters. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor
• Description: is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the technique of merging patterns for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; “true” if the pattern merge technique is to be applied, “false” if not. This is “true” by default.
  o setI18n(i18n): updates the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): sets the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser will be launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation

5.3.13 Filter

• Object: Filter

• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
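
The behaviour described in the NOTE can be pictured with a small plain-JavaScript sketch. The function filterIgnoringErrors and the PRICE field are hypothetical and are not part of the ITPilot API; the sketch only illustrates the idea that a record whose evaluation fails is left out while the remaining records are still filtered.

```javascript
// Hypothetical stand-in for the semantics described in the NOTE;
// this is not the ITPilot Filter implementation.
function filterIgnoringErrors(inputRecords, predicate) {
  var kept = [];
  for (var i = 0; i < inputRecords.length; i++) {
    try {
      if (predicate(inputRecords[i])) {
        kept.push(inputRecords[i]);
      }
    } catch (e) {
      // ON_ERROR_IGNORE-like behaviour: skip only the failing record.
    }
  }
  return kept;
}

// Illustrative records; the second one makes the predicate fail.
var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];
var expensive = filterIgnoringErrors(records, function (r) {
  if (r.PRICE === null) { throw new Error("PRICE is missing"); }
  return r.PRICE > 5;
});
```

In this sketch the record without a price is dropped and the other two are returned; with an ON_ERROR_RAISE-like policy the error would instead be propagated to the caller.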

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: generates a run loop over a specific form, using predetermined values for each of the included fields in each run.

• Functions:

  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: if “true”, the component executes its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when more than one field has the same name.
    • positions: values of the elements on which the component must iterate.

  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held for each element of values in the event of replicated values.
    • equal: boolean value that indicates whether the field values must exactly match those provided to the function (“true”) or may simply contain them (“false”).

  o click(field, value, state): selects an element and runs a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.

  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field when several fields on the form have the same name.
    • values: list of values that must be selected in the field.

  o toList(): returns the list of NSEQL sequences used in each iteration.

  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.

  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if “true”, the iterations are executed in parallel.

  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.

  o hasNext(): determines whether there are more results. Returns “true” if there is at least one more result, or “false” if there is not.

  o close(): closes the iterator.

  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If it has not, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
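
The hasNext(), next() and close() functions suggest the usual iteration pattern, sketched below. StubFormIterator is a hypothetical stand-in that only mimics a few of the documented method names so that the loop can be run outside ITPilot; the field name, the values and the string returned in place of a Page object are illustrative assumptions.

```javascript
// Hypothetical stand-in: mimics only some documented method names of Form_Iterator.
function StubFormIterator() {
  this.field = null;
  this.values = [];
  this.maxIterations = Infinity;
  this.index = 0;
  this.closed = false;
}
StubFormIterator.prototype.input = function (field, position, values) {
  this.field = field;
  this.values = values;
};
StubFormIterator.prototype.setMaxIterations = function (count) {
  this.maxIterations = count;
};
StubFormIterator.prototype.hasNext = function () {
  var limit = Math.min(this.values.length, this.maxIterations);
  return !this.closed && this.index < limit;
};
StubFormIterator.prototype.next = function () {
  // The real component returns a Page; the stub returns a descriptive string.
  var page = "page for " + this.field + "=" + this.values[this.index];
  this.index++;
  return page;
};
StubFormIterator.prototype.close = function () {
  this.closed = true;
};

var formIterator = new StubFormIterator();
formIterator.input("query", 0, ["denodo", "itpilot", "nseql"]);
formIterator.setMaxIterations(2); // only the first two values are used

var visitedPages = [];
try {
  while (formIterator.hasNext()) {
    visitedPages.push(formIterator.next());
  }
} finally {
  formIterator.close(); // release the iterator even if an iteration fails
}
```

Closing the iterator in a finally block is a design choice of the sketch, not a documented requirement.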

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool using a previously retrieved identification code.

• Functions:

  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.

  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing by using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information storage “cookies”.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.

• Functions:

  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).

  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.

  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.

    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional, but becomes compulsory once specified; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): updates the component name.
    • name: new component name.

  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

  o exec(): main function for running the component; returns a record representing the wrapper initialization parameters.
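
The difference between OPTIONAL, OBLIGATORY and FIXED fields can be sketched with a small stand-in. StubInit is hypothetical: unlike the real Init constructor, it receives the values supplied by the calling application directly, the constant values are invented for the example, and returning null for a missing optional field is an assumption of the sketch.

```javascript
// Hypothetical constants and stand-in; only the method names follow the guide.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

function StubInit(queryValues) {
  this.queryValues = queryValues; // values supplied by the calling application
  this.fields = [];
}
StubInit.prototype.setText = function (field, obl, fixedValue) {
  this.fields.push({ name: field, obl: obl || OPTIONAL, fixedValue: fixedValue });
};
StubInit.prototype.exec = function () {
  var record = {};
  for (var i = 0; i < this.fields.length; i++) {
    var f = this.fields[i];
    if (f.obl === FIXED) {
      record[f.name] = f.fixedValue; // constant, whatever the query says
    } else if (Object.prototype.hasOwnProperty.call(this.queryValues, f.name)) {
      record[f.name] = this.queryValues[f.name];
    } else if (f.obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + f.name);
    } else {
      record[f.name] = null; // assumption: a missing optional field is null
    }
  }
  return record;
};

var init = new StubInit({ TITLE: "Dune" });
init.setText("TITLE", OBLIGATORY);
init.setText("AUTHOR");                // defaults to OPTIONAL
init.setText("LANGUAGE", FIXED, "en");
var params = init.exec();
```

Here the query supplies only TITLE, so AUTHOR is left empty and LANGUAGE takes its fixed value; omitting TITLE would make exec() fail in this sketch.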

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records one by one.

• Functions:

  o Constructor(list)
    • list: list of records on which to iterate.

  o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.

  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.

• Functions:

  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class to use to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.

  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.

  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.

  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: allows iterating over different inter-related pages, using one or several browsing sequences.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.

  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.

  o close(): closes the iterator.

  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.

  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If it has not, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
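
The order in which syncWithPost, setBackSequence and setBackPages are taken into account when returning to a page can be summarized in a short sketch. The function chooseBackStrategy, its config object and the returned labels are hypothetical names used only to illustrate the decision described above; the default of one page for Back() is an assumption.

```javascript
// Hypothetical helper: restates the documented decision order in plain JavaScript.
function chooseBackStrategy(config) {
  // syncWithPost is described as the default method, so anything other
  // than an explicit false is treated as true in this sketch.
  if (config.syncWithPost !== false) {
    return { kind: "POST" };
  }
  // Without POST synchronization, an explicit back sequence takes precedence.
  if (config.backSequence) {
    return { kind: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // Assumption: go back one page when setBackPages was never called.
  return { kind: "BACK_COMMAND", pages: config.backPages || 1 };
}

var byDefault = chooseBackStrategy({});
var withSequence = chooseBackStrategy({
  syncWithPost: false,
  backSequence: "<NSEQL back program>" // placeholder, not real NSEQL
});
var withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```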

5.3.21 Output

• Object: Output

• Description: places a record in the wrapper output.

• Functions:

  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.

  o add(record): adds the given component input record to the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.

• Functions:

  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; allows a starting page to be indicated.

  o exec(): returns a page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as browser identifier and releases that specific browser.

• Functions:

  o Constructor(page)
    • page: page loaded on the browser that is going to be released.

  o Constructor(browserUuid)
    • browserUuid: browser identifier.

  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
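
The practical difference between the while structure of Figure 6 and the do… while structure of Figure 7 is that the latter always runs its body at least once. The sketch below shows this with StubCondition, a hypothetical stand-in that takes a JavaScript function instead of an ITPilot condition expression; only the exec([]) call mirrors the figures.

```javascript
// Hypothetical stand-in for Condition: evaluates a plain JavaScript function.
function StubCondition(predicate) {
  this.predicate = predicate;
}
StubCondition.prototype.exec = function (records) {
  return this.predicate(records);
};

// A condition that is never satisfied.
var neverTrue = new StubCondition(function () { return false; });

// Loop-style structure: the condition is checked before the first pass.
var loopRuns = 0;
while (neverTrue.exec([])) {
  loopRuns++;
}

// Repeat-style structure: the body runs once before the condition is checked.
var repeatRuns = 0;
do {
  repeatRuns++;
} while (neverTrue.exec([]));
```

With a condition that is false from the start, the Loop-style body never runs, whereas the Repeat-style body runs exactly once.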

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When the component is used from the graphic generation interface, it becomes a JavaScript function that is invoked from its position within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.

  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.

  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

  o close(): closes the connection with the running browser.

  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If it has not, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.

  o setBackPages(pages): determines the number of pages that ITPilot must go back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured for going back.
    • pages: number of pages to go back.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation

5.3.28 Store File

• Object: StoreFile

• Description: stores the contents passed as the input parameter in a file.

• Functions:

  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input, in which case the page content is stored.
    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
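The retry settings described above can be pictured with a small stand-alone sketch. The helper runWithRetries and its sleep callback are hypothetical and are not part of the ITPilot API. The sketch also assumes that a retry count of N means up to N additional attempts after the first failure, which this guide does not state explicitly.

```javascript
// Hypothetical helper (not an ITPilot function): runs `action`, retrying up
// to `retries` more times after a failure and waiting `delayMs` milliseconds
// between attempts. `sleep` is injected so the waiting strategy stays
// outside the sketch.
function runWithRetries(action, retries, delayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (error) {
            if (attempt >= retries) {
                throw error; // retries exhausted: propagate the last error
            }
            attempt += 1;
            sleep(delayMs);
        }
    }
}

// Usage: the action fails twice and succeeds on the third attempt.
var waits = [];
var result = runWithRetries(
    function (attempt) {
        if (attempt < 2) { throw new Error("temporary failure"); }
        return "stored on attempt " + (attempt + 1);
    },
    3,      // comparable to setRetries(3)
    3000,   // comparable to setRetryDelay(3000)
    function (ms) { waits.push(ms); } // records the delay instead of blocking
);
```

In this run the delay callback is invoked once per failed attempt, so two waits of 3000 milliseconds are recorded before the action succeeds.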

5.3.29 Thread

• Object: Thread

• Description: represents a thread in the ITPilot wrapper. It is typically used when the processing of each of the records obtained in an extraction operation is to be carried out concurrently.

• Functions:

  o wait(): blocks until all the executions launched with the execute function have finished.

  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that run in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number of concurrent threads.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix in the path <DENODO_HOME>/metadata/itp-custom-components containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function defines the output structure returned by the component. It is needed only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
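As a rough illustration of the functions listed above, the following sketch outlines part of a custom component file. The component name mycustom comes from the text. The body of mycustom_main (upper-casing a string) and the locally declared STRING_TYPE constant are assumptions made only so that the sketch runs on its own; inside ITPilot the type constants are presumably already available. The input structure function is left out because its contents are not shown in this guide, and the output structure function is left out because the sketch returns a simple string.

```javascript
// Assumption: declared locally so the sketch runs outside ITPilot.
// The value 13 is taken from the list of output types above.
var STRING_TYPE = 13;

// Main function of the hypothetical component "mycustom".
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    // Illustrative logic only: return the input as an upper-case string.
    if (mycustom_input !== null && mycustom_input !== undefined) {
        mycustom_output = String(mycustom_input).toUpperCase();
    }
    return mycustom_output;
}

// Declares the output type of the component.
function mycustom_getOutputType() {
    return STRING_TYPE;
}
```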

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by one .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.
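The escaping rule in the note can be sketched as follows. This is only an illustration under stated assumptions: it assumes the delimiting quotes are single quotes and that the escape character is a backslash, and the helper names escapeForVql and buildCreateWrapper, as well as the wrapper name my_wrapper, are made up for the example.

```javascript
// Hypothetical helper: escapes single quotes with a backslash so that the
// script can be embedded in a quote-delimited VQL literal.
// Assumption: single quotes are the delimiter used by the statement.
function escapeForVql(jsCode) {
    return jsCode.replace(/'/g, "\\'");
}

// Builds the statement text; "my_wrapper" below is a made-up wrapper name.
function buildCreateWrapper(name, jsCode) {
    return "CREATE WRAPPER ITP " + name + " '" + escapeForVql(jsCode) + "'";
}

var statement = buildCreateWrapper("my_wrapper", "var s = 'abc';");
```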

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

5.3.3.2 debugLevel function

• debugLevel(level): sets the trace level used when running this component. The possible levels are numbers from 0 to 5, where 0 means that no messages are written to the log trace and 5 means that all message types are written to the log trace file. The log types are the following:

  o TRACE
  o DEBUG
  o INFO
  o WARN
  o ERROR
  o FATAL

5.3.4 Add Record To List

• Object: Add_Object_To_List

• Description: adds a record to a list.

• Functions:

  o Constructor()

  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition

• Description: allows a condition to be defined. Two output connections determine the process flow depending on whether the condition is met or not.

• Functions:

  o Constructor(expr)
    • expr: defines the condition expression, expressed as a character string. For example, MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

  o exec(elements): main function of the Condition component. It evaluates the condition and returns “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
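To make the positional placeholders concrete, the following stand-alone sketch mimics how $0 and $1 could be resolved against the elements list. The function evaluateCondition is a hypothetical stand-in that only handles a single comparison between two placeholders. It is not the ITPilot Condition implementation and does not support the functions described in [GENER].

```javascript
// Hypothetical stand-in (not the ITPilot Condition object): evaluates an
// expression of the form "($i <op> $j)" against the `elements` array, where
// $i refers to elements[i].
function evaluateCondition(expr, elements) {
    var pattern = /^\(?\s*\$(\d+)\s*(<=|>=|==|!=|<|>)\s*\$(\d+)\s*\)?$/;
    var match = pattern.exec(expr);
    if (match === null) {
        throw new Error("Unsupported expression: " + expr);
    }
    var left = elements[Number(match[1])];
    var right = elements[Number(match[3])];
    switch (match[2]) {
        case "<=": return left <= right;
        case ">=": return left >= right;
        case "<": return left < right;
        case ">": return left > right;
        case "==": return left === right;
        default: return left !== right; // "!="
    }
}

// The first element must be less than or equal to the second.
var met = evaluateCondition("($0 <= $1)", [3, 7]);
var notMet = evaluateCondition("($0 <= $1)", [9, 7]);
```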

5.3.6 Create List

• Object: Create_List

• Description: creates an empty list.

• Functions:

  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.

  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser

• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.

• Functions:

  o Constructor(): creates a persistent browser and returns its handler.

  o exec(): executes the component.

5.3.8 Diff

• Object: Diff

• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.

• Functions:

  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, the green-background HTML end tag).
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, the red-background HTML end tag).
    • tokenSeparator: character string used to separate HTML page elements when the result page is generated, so that each of them can be identified.

  o diff(baseCode, finalCode): returns “true” if both pages are identical and “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.

  o exec(baseCode, finalCode): executes the Diff component and returns a character string with the HTML content of the pages, marking the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.

  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added content.
    • additionPrefixLabel: prefix to use for new content when generating the result page (by default, a green-background HTML tag).

  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added content.
    • additionSuffixLabel: suffix to use for new content when generating the result page (by default, the green-background HTML end tag).

  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted content.
    • deletionPrefixLabel: prefix to use for deleted content when generating the result page (by default, a red-background HTML tag).

  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted content.
    • deletionSuffixLabel: suffix to use for deleted content when generating the result page (by default, the red-background HTML end tag).

  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.
    • nullWhenEquals: “true” means that “null” is returned when both pages are equal; “false” means that the result page is returned.

  o setIgnoreTagAttributes(simplifyTags): makes the component disregard HTML tag attributes when comparing the pages.
    • simplifyTags: “true” means that the HTML tag attributes are ignored; “false” means that they are taken into account.

  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.

  o setShowRemovedContent(mergedDeletions): determines whether deleted content is shown in the result page.
    • mergedDeletions: “true” means that the deleted content is shown. If the value is “false”, the settings made with the functions setDeletionPrefixLabel and setDeletionSuffixLabel are not taken into account.

  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions are applied to the HTML tokens of the source pages before they are compared.
    • replacement: Perl [PERL] regular expression.

  o addIgnoredToken(regexp): adds a regular expression to a list. HTML tokens of the page that match any of these regular expressions are discarded before the comparison starts.
    • regexp: Perl [PERL] regular expression.
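The role of the prefix, suffix and separator labels can be illustrated with a stand-alone sketch. This guide does not describe the comparison algorithm that the Diff component uses, so the sketch substitutes a plain longest-common-subsequence comparison over tokens. The function diffTokens, the labels object and the bracket markers are all hypothetical, chosen instead of the default coloured HTML tags to keep the output readable.

```javascript
// Hypothetical stand-in for the comparison (not the ITPilot algorithm):
// a longest-common-subsequence diff over tokens that wraps deleted and
// added tokens in the configured labels.
function diffTokens(baseTokens, finalTokens, labels) {
    var n = baseTokens.length;
    var m = finalTokens.length;
    var i, j;
    // lcs[i][j] = length of the LCS of baseTokens[i..] and finalTokens[j..]
    var lcs = [];
    for (i = 0; i <= n; i++) {
        lcs.push(new Array(m + 1).fill(0));
    }
    for (i = n - 1; i >= 0; i--) {
        for (j = m - 1; j >= 0; j--) {
            lcs[i][j] = baseTokens[i] === finalTokens[j]
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }
    function deleted(token) {
        return labels.deletionPrefix + token + labels.deletionSuffix;
    }
    function added(token) {
        return labels.additionPrefix + token + labels.additionSuffix;
    }
    var out = [];
    i = 0;
    j = 0;
    while (i < n && j < m) {
        if (baseTokens[i] === finalTokens[j]) {
            out.push(baseTokens[i]); // unchanged token
            i++;
            j++;
        } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
            out.push(deleted(baseTokens[i++]));
        } else {
            out.push(added(finalTokens[j++]));
        }
    }
    while (i < n) { out.push(deleted(baseTokens[i++])); }
    while (j < m) { out.push(added(finalTokens[j++])); }
    return out.join(labels.tokenSeparator);
}

var labels = {
    additionPrefix: "[+", additionSuffix: "+]",
    deletionPrefix: "[-", deletionSuffix: "-]",
    tokenSeparator: " "
};
var merged = diffTokens(["<b>", "old", "</b>"], ["<b>", "new", "</b>"], labels);
```

With these inputs the unchanged tags are emitted as they are, while the token present only in the base page is wrapped in the deletion labels and the token present only in the final page is wrapped in the addition labels.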

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4 Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.

• Functions:

  o Constructor(expression)
    • expression: object that defines the expression, expressed as a character string. For example, MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].

  o exec(exprInput): runs the component and returns the value resulting from the expression given in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor

• Description: extracts structured data from an HTML page by means of a DEXTL program ([DEXTL]).

• Functions:

  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the previously created record that will be used to return the data extracted by the specification.

  o exec(): main extractor method, which runs the specification given in the constructor. It returns a list of records of the type defined by the structure parameter of the constructor.

  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter. “true” if the pattern merging technique is to be applied, “false” if not. It is “true” by default.

  o setI18n(i18n): sets the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options such as ES_EURO, US_PST, GB, and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL of the resource to be downloaded (optional).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.

  o exec(page): runs the component and returns the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.

  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.

  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access the page initially. This is the default synchronization method.
    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, ITPilot executes a Back() NSEQL command.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which further data extraction operations are going to be executed.
    • back: back sequence NSEQL program.

  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is launched and the information from the previous session is imported into it.

  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation

5.3.13 Filter

• Object: Filter

• Description: carries out a filtering operation on a list of records and returns those that meet a given condition.

• Functions:

  o Constructor(expr, auxiliaryRecords)
    • expr: expression of the filtering operation that is applied to the list of records passed to the exec function.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to be filtered.

  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression given in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to be filtered.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
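The behaviour described in the note can be pictured with the following stand-alone sketch. The function filterRecords, its ignoreErrors flag and the PRICE field are hypothetical stand-ins chosen for illustration. They are not the ITPilot Filter object or its error handling API.

```javascript
// Hypothetical stand-in for the behaviour in the note: a record whose
// evaluation throws is skipped when errors are ignored, while the remaining
// records are still filtered normally.
function filterRecords(inputRecords, predicate, ignoreErrors) {
    var kept = [];
    inputRecords.forEach(function (record) {
        try {
            if (predicate(record)) {
                kept.push(record);
            }
        } catch (error) {
            if (!ignoreErrors) {
                throw error; // comparable to raising the error
            }
            // Ignoring: drop only the record that caused the error.
        }
    });
    return kept;
}

var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];
var cheap = filterRecords(records, function (record) {
    if (record.PRICE === null) { throw new Error("missing PRICE"); }
    return record.PRICE <= 20;
}, true);
```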

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: generates an execution loop over a specific form, in which predetermined values for each of the selected fields are used in each run.

• Functions:

  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that finds the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.
    • inputPage: input page from which the selected form is iteratively invoked.
    • parallelIterator: if “true”, the component executes its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or only contained in it (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field when the form has more than one field with the same name.
    • positions: values of the elements over which the component must iterate.

  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field when the form has several fields with the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held for each element of values in the event of replicated values.
    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.

  o click(field, value, state): selects an element and fires a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the state of the element:
      • CLICKED_ELEMENT: marks the element.
      • NON_CLICKED_ELEMENT: leaves the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o input(field, position, values): indicates the values entered in an input field.
    • field: name of the HTML input field.
    • position: position of the field when the form has several fields with the same name.
    • values: list of values to be entered in the field.

  o textarea(field, position, values): indicates the values entered in a text area.
    • field: name of the HTML text area field.
    • position: position of the field when the form has several fields with the same name.
    • values: list of values to be entered in the field.

  o toList(): returns the list of NSEQL sequences used in each iteration.

  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.

  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

  o setParallelIterator(flag): makes the component launch its iterations in parallel.
    • flag: if “true”, the iterations are executed in parallel.

  o next(inputPage): returns the page resulting from running one iteration of the component.
    • inputPage: optional parameter that sets a new starting page on which the next iteration of the component is run.

  o hasNext(): determines whether there are more results. It returns “true” if there is at least one more result, or “false” if there is not.

  o close(): closes the iterator.

  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization method.

    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally indicates an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection is reused.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation used by the component. The accepted values are:

    • 0: default browser implementation.

    • 1: Internet Explorer browser implementation.

    • 2: Firefox browser implementation.

    • 3: Denodo HTTP browser implementation.

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.

• Functions:

  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.

    • browserUuid: browser id.

  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).

    • pageType: type of browser used to access the page:

      • SEQUENCE_IEBROWSER = 1

      • SEQUENCE_HTTP_BROWSER = 2

    • lastURL: last URL from which the page is coming.

    • lastURLMethod: access method (GET, POST) of the URL from which the page is coming.

    • lastURLPostParameters: POST-method parameters of the URL from which the page is coming.

    • cookie: stored “cookies” information.

    • proxyUser: user name to access the proxy, if required.

    • proxyPassword: user password to access the proxy, if required.

    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.

• Functions:

  o Constructor(input, output)

    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).

  o get(name): returns the value of a field of the record created from the initialization parameters.

    • name: name of the record field.

  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

    • field: name of the field to create.

    • format: representation format of the date field. The format is optional, but once it is specified it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].

    • obl: indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): updates the component name.

    • name: new component name.

  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.

    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
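
As a rough illustration of how the Init functions fit together, the following sketch declares three searchable parameters and reads one of them back. It is not the ITPilot implementation: the Init object defined here is a small stand-in written only so that the snippet runs outside ITPilot, the addField helper and the values given to OPTIONAL, OBLIGATORY and FIXED are assumptions, and the field names (COMPANY, MAX_RESULTS, SOURCE) are hypothetical.

```javascript
// Stand-in constants; the real values are defined by ITPilot (assumption).
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

// Minimal stand-in for the Init component, for illustration only.
function Init() {
    this.fields = {};
}
// Hypothetical helper shared by the typed setters below.
Init.prototype.addField = function (type, field, obl, fixedValue) {
    this.fields[field] = {
        type: type,
        obl: obl === undefined ? OPTIONAL : obl,   // OPTIONAL is the default
        value: obl === FIXED ? fixedValue : null
    };
};
Init.prototype.setText = function (field, obl, fixedValue) {
    this.addField("text", field, obl, fixedValue);
};
Init.prototype.setInt = function (field, obl, fixedValue) {
    this.addField("int", field, obl, fixedValue);
};
Init.prototype.get = function (name) {
    return this.fields[name].value;
};
Init.prototype.exec = function () {
    return this.fields;   // the stand-in returns the declared fields as the record
};

// Usage following the signatures documented in this section.
var init = new Init();
init.setText("COMPANY", OBLIGATORY);        // must be supplied in every query
init.setInt("MAX_RESULTS");                 // optional by default
init.setText("SOURCE", FIXED, "catalog");   // constant value
var initRecord = init.exec();
```

In a real wrapper only the last five lines would be written; the objects and constants are provided by the ITPilot runtime.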

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records, one by one.

• Functions:

  o Constructor(list)

    • list: list of records over which to iterate.

  o hasNext(): determines whether there are more results over which to iterate. It returns “true” if there is at least one more result.

  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}

Figure 5 Using threads in the Iterator component
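
For comparison, a sequential (non-parallel) traversal with hasNext() and next() might look like the sketch below. The Iterator defined here is an array-backed stand-in so that the snippet can run outside ITPilot, and the records and their TITLE field are hypothetical.

```javascript
// Stand-in for the Iterator component, backed by a plain array (assumption).
function Iterator(list) {
    this.list = list;
    this.position = 0;
}
Iterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
Iterator.prototype.next = function () {
    return this.list[this.position++];
};

// Hypothetical records; real records would come from an extraction component.
var records = [{ TITLE: "first" }, { TITLE: "second" }, { TITLE: "third" }];

var iterator = new Iterator(records);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();   // records arrive in list order
    titles.push(recordInstance.TITLE);
}
```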

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.

• Functions:

  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

    • uuid: unique identifier of the component.

    • uri: connection URL of the database.

    • driver: driver class used to connect to the data source.

    • userName: user name.

    • password: user password.

    • structure: structure of the component’s output record list. It is defined as a record of values.

    • baseRecords: record list to be used.

    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established at startup, ready to be used.

    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

    • query: SQL query that returns the results required by the component.

  o exec(query, baseRecords): executes the JDBCExtractor component.

    • query: SQL query that returns the results required by the component.

    • baseRecords: record list to be used.

  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established at startup, ready to be used.

    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.

  o addDriverProperty(propname, propvalue): adds a JDBC driver property.

    • propname: property name.

    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}

Figure 6 Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: iterates over a series of inter-related pages, using either a single browsing sequence or several.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.

    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session’s information.

    • inputPage: indicates the page from which the next browsing sequence is run.

  o next(inputRecords, inputPage): returns the next iteration element.

    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.

    • inputPage: indicates the page from which the next pages are accessed.

  o close(): closes the iterator.

  o setRetries(count): sets the number of retries in the event of an error when accessing the next page.

    • count: number of retries.

  o setRetryDelay(count): sets the interval between two retries.

    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally indicates an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection is reused.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation used by the component. The accepted values are:

    • 0: default browser implementation.

    • 1: Internet Explorer browser implementation.

    • 2: Firefox browser implementation.

    • 3: HTTP browser implementation.

5.3.21 Output

• Object: Output

• Description: places a record in the wrapper output.

• Functions:

  o Constructor(structure)

    • structure: indicates the component input record to be used as the wrapper result.

  o add(record): adds the given component input record to the wrapper result.

    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.

• Functions:

  o Constructor(recordsObj, name)

    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.

    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): adds a new field to the record under construction.

    • fieldName: name of the field.

    • expression: expression that defines the field. For example, “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record in the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].

    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:

      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.

    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session’s information. In general this value will be “true”, although in some cases that may not be a good option if the previous iterator runs in parallel with it.

    • inputPage: optional; indicates a starting page.

  o exec(): returns a page object that represents the target page of the browsing sequences.

  o All of the functions offered by the Sequence component (see section 5.3.27).

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.

• Functions:

  o Constructor(page)

    • page: page loaded in the browser that is going to be released.

  o Constructor(browserUuid)

    • browserUuid: browser identifier.

  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));

Figure 7 Using the Repeat function
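
One practical difference between the while structure of Figure 6 and the do… while structure of Figure 7 is the moment at which the condition is evaluated. The sketch below shows this with plain JavaScript; CounterCondition is a hypothetical stand-in that only mimics the exec([]) call of a Condition object, and the counter values are arbitrary.

```javascript
// Stand-in for a Condition object: exec() reports whether the counter is
// still below a limit. It only mimics the exec([]) call shown in the figures.
function CounterCondition(state, limit) {
    this.exec = function (inputs) {
        return state.count < limit;
    };
}

// Loop style: the condition is checked before the first pass.
var loopState = { count: 5 };
var loopPasses = 0;
var loop = new CounterCondition(loopState, 3);
while (loop.exec([])) {
    loopState.count++;
    loopPasses++;
}

// Repeat style: the body runs once before the condition is checked.
var repeatState = { count: 5 };
var repeatPasses = 0;
var repeat = new CounterCondition(repeatState, 3);
do {
    repeatState.count++;
    repeatPasses++;
} while (repeat.exec([]));
```

With the condition false from the start, the while body never runs, whereas the do… while body runs exactly once.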

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)

    • sequence: NSEQL browsing program (see [NSEQL]).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection is reused (“true”) or not (“false”). See [GENER] for further information.

    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.

  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.

    • inputValues: list of values that can be used as input parameters within the browsing sequence.

    • inputPage: optional parameter that indicates the page from which the component browsing sequence is run.

  o setRetries(count): sets the number of retries in the event of failure.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o close(): closes the connection with the running browser.

  o syncWithPost(flag): indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally indicates an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection is reused.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

    • pages: number of back pages.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): determines the browser implementation used by the component. The accepted values are:

    • 0: default browser implementation.

    • 1: Internet Explorer browser implementation.

    • 2: Firefox browser implementation.

    • 3: Denodo HTTP browser implementation.
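
The following sketch illustrates how setRetries and setRetryDelay might be combined with exec(). The Sequence defined here is only a stand-in that makes the snippet runnable outside ITPilot: its navigate property is a hypothetical hook that replaces the real browser, the retry loop is an assumption about how retries behave, and the NSEQL program is left as a placeholder. The value given to SEQUENCE_IEBROWSER is the one listed in section 5.3.15.

```javascript
var SEQUENCE_IEBROWSER = 1;   // value as listed for the Get Page component

// Stand-in for the Sequence component (assumption): exec() calls a navigation
// hook and retries it up to `retries` extra times. The real component also
// waits retryDelay milliseconds between attempts; the stand-in only stores it.
function Sequence(sequence, sequenceType, reusableConnection, inputPage) {
    this.sequence = sequence;
    this.retries = 0;
    this.retryDelay = 0;
    this.navigate = null;     // hypothetical hook, not part of the documented API
}
Sequence.prototype.setRetries = function (count) { this.retries = count; };
Sequence.prototype.setRetryDelay = function (mseconds) { this.retryDelay = mseconds; };
Sequence.prototype.toString = function () { return this.sequence; };
Sequence.prototype.exec = function (inputValues, inputPage) {
    var lastError = null;
    for (var attempt = 0; attempt <= this.retries; attempt++) {
        try {
            return this.navigate(this.sequence);
        } catch (e) {
            lastError = e;    // remember the failure and try again
        }
    }
    throw lastError;
};

// A fake navigation that fails twice and then succeeds.
var attempts = 0;
function flakyNavigate(program) {
    attempts++;
    if (attempts < 3) {
        throw new Error("navigation failed");
    }
    return { url: "http://www.example.com/" };
}

var sequence = new Sequence("<NSEQL program>", SEQUENCE_IEBROWSER, true);
sequence.navigate = flakyNavigate;
sequence.setRetries(2);       // up to two retries after the first failure
sequence.setRetryDelay(500);  // 500 ms between retries
var page = sequence.exec([]);
```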

5.3.28 Store File

• Object: StoreFile

• Description: stores in a file the contents received as the input parameter.

• Functions:

  o Constructor(content, file)

    • content: string- or binary-type value that contains the contents to be stored. A page value is also supported as input; in that case, the page content is stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.

    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): sets the number of retries in the event of failure.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread

• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): causes the thread to wait until all the executions invoked with the execute function have finished.

  o execute(functionName, <list of arguments>): launches a thread that runs the given function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.

    • int: maximum number.
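
A typical calling pattern (limit the concurrency, launch one execution per record, then wait) could be sketched as follows. The Thread defined here is a synchronous stand-in written so that the snippet is deterministic and runs outside ITPilot; it accepts a function reference as its first argument, which is an assumption, and processRecord and the record fields are hypothetical.

```javascript
// Stand-in for the Thread object (assumption): calls run immediately and in
// order. The real object runs them concurrently.
function Thread() {
    this.maxConcurrent = null;
    this.finished = 0;
}
Thread.prototype.setMaxConcurrentThreads = function (max) {
    this.maxConcurrent = max;
};
Thread.prototype.execute = function (fn) {
    var args = Array.prototype.slice.call(arguments, 1);   // remaining arguments
    fn.apply(null, args);
    this.finished++;
};
Thread.prototype.wait = function () {
    return this.finished;   // nothing left to wait for in the synchronous stand-in
};

// Hypothetical per-record processing function.
var processed = [];
function processRecord(prefix, record) {
    processed.push(prefix + record.ID);
}

var records = [{ ID: 1 }, { ID: 2 }, { ID: 3 }];
var thread = new Thread();
thread.setMaxConcurrentThreads(2);   // at most two executions in parallel
for (var i = 0; i < records.length; i++) {
    thread.execute(processRecord, "record-", records[i]);
}
thread.wait();                       // returns once every execution has finished
```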

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:

    • LIST_TYPE = 1

    • PAGE_TYPE = 2

    • RECORD_TYPE = 3

    • SIMPLE_TYPE = 4

    • ARRAY_TYPE = 5

    • BINARY_TYPE = 6

    • BOOLEAN_TYPE = 7

    • DATE_TYPE = 8

    • DOUBLE_TYPE = 9

    • FLOAT_TYPE = 10

    • INT_TYPE = 11

    • LONG_TYPE = 12

    • STRING_TYPE = 13

    • URL_TYPE = 14

    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
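
Put together, the file for a hypothetical component named “trimtitle” might look like the sketch below. It assumes the four functions are ordinary JavaScript function declarations; the plain object used as a record and the field-name lists returned by the two structure functions are placeholders, since the real functions work with ITPilot data structures (see section 5.3.2).

```javascript
// Output type codes as listed in this section.
var LIST_TYPE = 1, PAGE_TYPE = 2, RECORD_TYPE = 3, SIMPLE_TYPE = 4;

// Main function: receives the input element and returns the component output.
// A plain object stands in for an ITPilot record here (assumption).
function trimtitle_main(trimtitle_input) {
    var trimtitle_output = null;
    var title = String(trimtitle_input.TITLE);
    // Strip leading and trailing whitespace from the TITLE field.
    trimtitle_output = { TITLE: title.replace(/^\s+|\s+$/g, "") };
    return trimtitle_output;
}

// Input schema. A list of field names is used only as a placeholder (assumption).
function trimtitle_getInputStructure() {
    return ["TITLE"];
}

// Output type: a record, so trimtitle_getOutputStructure must also be defined.
function trimtitle_getOutputType() {
    return RECORD_TYPE;
}

// Output structure, again as a placeholder list of field names (assumption).
function trimtitle_getOutputStructure() {
    return ["TITLE"];
}

var result = trimtitle_main({ TITLE: "  ITPilot  " });
```

Following the naming rule of this section, such a file would be saved as trimtitle.js so that its name matches the component name.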

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> is the name of the component.

• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing a VQL statement of the following form:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been developed.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.
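
The escaping described in the note can be automated when the statement is assembled programmatically. The helper functions below are hypothetical and rest on two assumptions that should be checked against [VQL]: that the delimiter is the double quote and that the escape character is the backslash. Backslashes already present in the script are left untouched, since the note does not cover them, and the optional MAINTENANCE clause is omitted.

```javascript
// Escapes double quotes so that the script can be placed between double
// quotes. Delimiter and escape character are assumptions (see lead-in).
function escapeForVql(jscode) {
    return jscode.replace(/"/g, "\\\"");
}

// Builds the statement text following the template shown in this section.
function buildCreateWrapper(name, jscode) {
    return "CREATE WRAPPER ITP " + name + " \"" + escapeForVql(jscode) + "\"";
}

// Hypothetical wrapper name and script.
var script = "var greeting = \"hello\";";
var statement = buildCreateWrapper("sample_wrapper", script);
```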

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

5.3.4 Add Record To List

• Object: Add_Object_To_List
• Description: adds a record to a list.
• Functions:
  o Constructor()
  o exec(record, list): executes the function.
    • record: record to be added to the list.
    • list: list to which the record is added.

5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
  o Constructor(expr)
    • expr: this parameter defines the condition expression. It is expressed as a string of characters (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(elements): main function of the Condition component. It evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
    • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
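The way the positional placeholders $0, $1, … map onto the elements list can be illustrated with a small stand-alone evaluator. This is only a sketch under stated assumptions: evaluateCondition is a hypothetical helper, it substitutes each placeholder with the matching element and then evaluates the result as plain JavaScript, and it therefore covers simple comparisons only. ITPilot's own expression language and functions (Appendix A of [GENER]) are not reproduced here.

```javascript
// Hypothetical helper, NOT the ITPilot Condition component.
// Replaces $n with the n-th element (as a literal) and evaluates the
// resulting comparison as a JavaScript expression.
function evaluateCondition(expr, elements) {
    var substituted = expr.replace(/\$(\d+)/g, function (match, index) {
        var position = Number(index);
        if (position >= elements.length) {
            throw new Error("No element for placeholder " + match);
        }
        return JSON.stringify(elements[position]);
    });
    return Boolean(new Function("return (" + substituted + ");")());
}

var lessOrEqual = "($0 <= $1)";
var met = evaluateCondition(lessOrEqual, [3, 5]);     // 3 <= 5
var notMet = evaluateCondition(lessOrEqual, [7, 5]);  // 7 <= 5
```

Here met is true and notMet is false, which mirrors the two output connections of the component: one flow for each outcome.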

5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.

5.3.8 Diff

• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return “null” instead of the page itself.
    • nullWhenEquals: “true” implies that “null” will be returned when both pages are equal; “false” means that the result page will be returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether the component takes the HTML tag attributes into account when comparing both pages.
    • simplifyTags: “true” means that the HTML tag attributes will be ignored. With “false” they will not be ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization will be taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page or not.
    • mergedDeletions: if “true”, the deleted content will be shown. If the value is “false”, the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to the HTML tokens of the page. Those tokens that match the regular expression will be discarded before starting the comparison.
    • regexp: Perl [PERL] regular expression.
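The role of the prefix and suffix labels, the token separator and the "show removed content" switch can be illustrated with a simplified token diff. The sketch below is an assumption-laden illustration, not the algorithm ITPilot uses: markDifferences and lcsTable are hypothetical names, tokens are obtained by splitting on the separator, the comparison is a textbook longest-common-subsequence walk, and the labels are plain text markers rather than the colored HTML tags that the component uses by default.

```javascript
// Simplified token diff; NOT the ITPilot Diff implementation.
function lcsTable(a, b) {
    // table[i][j] = length of the longest common subsequence of a[i..] and b[j..]
    var table = [];
    for (var i = a.length; i >= 0; i--) {
        table[i] = [];
        for (var j = b.length; j >= 0; j--) {
            if (i === a.length || j === b.length) {
                table[i][j] = 0;
            } else if (a[i] === b[j]) {
                table[i][j] = table[i + 1][j + 1] + 1;
            } else {
                table[i][j] = Math.max(table[i + 1][j], table[i][j + 1]);
            }
        }
    }
    return table;
}

function markDifferences(baseCode, finalCode, labels) {
    var a = baseCode.split(labels.tokenSeparator);
    var b = finalCode.split(labels.tokenSeparator);
    var table = lcsTable(a, b);
    var out = [];
    var i = 0;
    var j = 0;
    while (i < a.length || j < b.length) {
        if (i < a.length && j < b.length && a[i] === b[j]) {
            out.push(a[i]);                       // unchanged token
            i++;
            j++;
        } else if (j < b.length && (i === a.length || table[i][j + 1] >= table[i + 1][j])) {
            out.push(labels.additionPrefix + b[j] + labels.additionSuffix);
            j++;                                  // token only in the final page
        } else {
            if (labels.showRemovedContent) {
                out.push(labels.deletionPrefix + a[i] + labels.deletionSuffix);
            }
            i++;                                  // token only in the base page
        }
    }
    return out.join(labels.tokenSeparator);
}

var labels = {
    additionPrefix: "[+", additionSuffix: "+]",
    deletionPrefix: "[-", deletionSuffix: "-]",
    tokenSeparator: " ", showRemovedContent: true
};
var marked = markDifferences("price 10 EUR", "price 12 EUR", labels);
// marked is "price [+12+] [-10-] EUR"
```

With showRemovedContent set to false the deleted token is dropped from the result and the deletion labels play no part, which matches the behavior described for setShowRemovedContent.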

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; “true” if the pattern merging technique is to be applied, or “false” if not. This is “true” by default.
  o setI18n(i18n): function that updates the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB, and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): this function lets the user set the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if it has not, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): this function lets the user optionally set an explicit navigation sequence back to the source page, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): this function indicates whether connections will be reused or not.
    • reusingConnection: if the value is set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser will be launched, importing information from the previous session.
  o setBackPages(pages): this function determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
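The order in which the three ways of returning to the source page are considered (POST synchronization, then an explicit back sequence, then the Back() command) can be written as a small decision function. This is a sketch only: chooseBackStrategy, the configuration object and the strategy names are hypothetical, and the back sequence is shown as a placeholder string rather than a real NSEQL program.

```javascript
// Hypothetical decision helper mirroring the fallback order described above.
// config.syncWithPost : value that would be passed to syncWithPost(flag)
// config.backSequence : NSEQL program passed to setBackSequence, if any
// config.backPages    : value passed to setBackPages, if any
function chooseBackStrategy(config) {
    if (config.syncWithPost) {
        // Re-send the POST parameters originally used to reach the page.
        return { method: "POST" };
    }
    if (config.backSequence) {
        // Run the explicit back sequence supplied by the wrapper developer.
        return { method: "BACK_SEQUENCE", sequence: config.backSequence };
    }
    // Neither option is configured: fall back to the Back() NSEQL command.
    return { method: "BACK_COMMAND", pages: config.backPages };
}

var byPost = chooseBackStrategy({ syncWithPost: true });
var bySequence = chooseBackStrategy({
    syncWithPost: false,
    backSequence: "<NSEQL back sequence>"
});
var byBackCommand = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

Note that the back sequence and the number of back pages only matter once POST synchronization has been switched off, since POST is the default method.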

5.3.13 Filter

• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for the list of records, which is passed in the exec function.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
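The effect of ON_ERROR_IGNORE described in the note can be sketched with plain JavaScript. In this illustration, filterRecords is a hypothetical helper, the selection expression is replaced by an ordinary predicate function, and the two constants are local stand-ins for the ITPilot ones; the only behavior taken from the text is that a record whose evaluation fails is left out instead of aborting the whole operation.

```javascript
// Local stand-ins for the ITPilot error-handling constants.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical helper: keeps the records accepted by the predicate.
// With ON_ERROR_IGNORE, a record whose evaluation throws is skipped;
// with any other policy the error is propagated to the caller.
function filterRecords(inputRecords, predicate, onError) {
    var kept = [];
    inputRecords.forEach(function (record) {
        try {
            if (predicate(record)) {
                kept.push(record);
            }
        } catch (error) {
            if (onError !== ON_ERROR_IGNORE) {
                throw error;
            }
        }
    });
    return kept;
}

function priceAboveFive(record) {
    if (typeof record.price !== "number") {
        throw new Error("price is not a number");
    }
    return record.price > 5;
}

var records = [{ price: 10 }, { price: null }, { price: 3 }, { price: 30 }];
var filtered = filterRecords(records, priceAboveFive, ON_ERROR_IGNORE);
// filtered holds the records with prices 10 and 30
```

Calling the same helper with ON_ERROR_RAISE on this input raises the error from the second record, so no partial list is returned.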

5.3.14 Form Iterator

• Object: Form_Iterator
• Description: generates an execution loop over a specific form, using predetermined values for each of the form fields in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be used as variables in the different NSEQL navigation sequences of this component.
    • inputPage: input page from which the selected form is iteratively invoked.
    • parallelIterator: if “true”, the component will execute its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. There are JavaScript constants defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained therein (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. There are JavaScript constants defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the positions selected in a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position of the field in the event of more than one field with the same name.
    • positions: positions of the elements over which the component must iterate.
  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations over a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each element of values in the event of replicated values.
    • equals: Boolean value that indicates whether the provided values must exactly match the field values (“true”) or need only be contained in them (“false”).

  o click(field, value, state): function that selects an element and runs a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the elements selected, as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the state of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): function that indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be entered in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML input field.
    • position: position of the field in the event of several fields on the form with the same name.
    • values: list of values that must be entered in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: number that determines the maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): determines whether the component launches the iterations in parallel.
    • flag: if “true”, the iterations will be executed in parallel.
  o next(inputPage): returns the page resulting from running one iteration of the component.
    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.

  o hasNext(): function that determines whether there are more results. It returns “true” if there is at least one more result, or “false” if there is not.
  o close(): function that closes the iterator.
  o syncWithPost(flag): this function indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If it has not, an NSEQL Back() command is run.
  o setBackSequence(back): this function optionally sets an explicit navigation sequence back to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must go back when the NSEQL Back() command is run because neither an explicit back sequence nor POST synchronization has been configured.
  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
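How the configured field values turn into a sequence of iterations can be pictured as a cartesian product that is then walked with hasNext() and next(). The sketch below is a simplification under stated assumptions: the constants are local stand-ins, buildCombinations and makeIterator are hypothetical helpers, the field names are invented, and each iteration is represented by a plain object instead of the page that the real component returns. The ordering of combinations is a choice made for this example, not something the text above specifies.

```javascript
// Local stand-ins for the ITPilot constants named above.
var CLICKED_ELEMENT = "CLICKED_ELEMENT";
var NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
var CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// A checkbox state expands to one option, or to two when both are requested.
function expandState(state) {
    if (state === CLICKED_AND_NON_CLICKED_ELEMENT) {
        return [true, false];
    }
    return [state === CLICKED_ELEMENT];
}

// Cartesian product of the options of every field.
function buildCombinations(fields) {
    return fields.reduce(function (partial, field) {
        var extendedList = [];
        partial.forEach(function (combination) {
            field.options.forEach(function (option) {
                var extended = Object.assign({}, combination);
                extended[field.name] = option;
                extendedList.push(extended);
            });
        });
        return extendedList;
    }, [{}]);
}

// Iterator capped by a maximum number of iterations.
function makeIterator(combinations, maxIterations) {
    var index = 0;
    var limit = Math.min(combinations.length, maxIterations);
    return {
        hasNext: function () { return index < limit; },
        next: function () {
            if (index >= limit) { throw new Error("No more iterations"); }
            return combinations[index++];
        },
        close: function () { index = limit; }
    };
}

var combinations = buildCombinations([
    { name: "query", options: ["java", "python"] },
    { name: "onlyNew", options: expandState(CLICKED_AND_NON_CLICKED_ELEMENT) }
]);
var iterator = makeIterator(combinations, 3);
var visited = [];
while (iterator.hasNext()) {
    visited.push(iterator.next());
}
iterator.close();
```

Two text values combined with a checkbox in the "both" state give four combinations; the cap of three iterations means only the first three are visited.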

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. It is possible to execute the function with no parameters, for later browsing by using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: information storage “cookies”.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: is responsible for storing the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record will be generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created as the group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is mandatory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is mandatory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is mandatory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is mandatory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is mandatory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether the field to create is mandatory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

o setBoolean(field obl fixedValue) this creates a Boolean-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setLink(field obl fixedValue) this creates a URL-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setDate(field format obl fixedValue) this creates a date-type field in the initialization record

bull field name of the field to create

bull format representation format of the date field This format is optional but becomes compulsory if completed Otherwise the wrapper may not be run This representation format is defined in [DATEFORMAT]

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

         • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
         • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
         • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
      • fixedValue: optional parameter that indicates a constant value assigned to the field.

   o setName(name): update function for the component name.
      • name: new component name.

   o setI18n(i18n): function which updates the process i18n.
      • i18n: type of internationalization to be used. ITPilot provides different types of i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

   o exec(): main function for running the component, returning a record representing the wrapper initialization parameters.
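
The effect of the three obl values can be illustrated with a small stand-alone sketch. It does not use the ITPilot Init object: resolveParameter and the three constants are hypothetical names introduced only for this example, and the sketch assumes that a FIXED field ignores any value supplied in the query.

```javascript
// Hypothetical stand-in, not part of the ITPilot API.
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

// Returns the value a field would take for one wrapper query.
function resolveParameter(obl, fixedValue, queryValue) {
    var supplied = queryValue !== undefined && queryValue !== null;
    if (obl === FIXED) {
        // Assumption: the constant always wins over the query value.
        return fixedValue;
    }
    if (obl === OBLIGATORY && !supplied) {
        throw new Error("obligatory parameter has no value");
    }
    // OPTIONAL fields may stay empty.
    return supplied ? queryValue : null;
}
```

For instance, resolveParameter(OBLIGATORY, undefined, null) throws, whereas the same call with OPTIONAL returns null.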

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates on a list of records, one by one.
• Functions:
   o Constructor(list)
      • list: list of records on which to iterate.
   o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.
   o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphic interface becomes the following JavaScript structure, which uses the Thread object described in section 5.3.29:

var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}

Figure 5: Using threads in the Iterator component
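
The hasNext()/next() protocol can also be used sequentially. The following sketch shows the pattern outside ITPilot: ListIterator is a hypothetical stand-in written only for this example (the real Iterator object is supplied by ITPilot), and records are assumed to be plain objects with a TITLE field.

```javascript
// Stand-in for illustration only; not the ITPilot Iterator.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
ListIterator.prototype.next = function () {
    return this.list[this.position++];
};

// Sequential use: visit every record in list order.
function collectTitles(records) {
    var iterator = new ListIterator(records);
    var titles = [];
    while (iterator.hasNext()) {
        var recordInstance = iterator.next();
        titles.push(recordInstance.TITLE);
    }
    return titles;
}
```

Calling hasNext() before every next() keeps the loop safe on an empty list, which is the same guard used in Figure 5.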

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: these functions allow sending a query to any source available via JDBC, and return a record list with the obtained results.
• Functions:
   o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
      • uuid: component unique identifier.
      • uri: connection URL to the database.
      • driver: driver class to use to connect to the data source.
      • userName: user name.
      • password: user password.
      • structure: structure of the component’s output record list. It is defined as a record of values.
      • baseRecords: record list to be used.
      • maxPoolSize: maximum number of connections that can be managed by the browser pool at the same time.
      • initialPoolSize: initial number of browser pool connections. A number of idle connections are established, ready to be used.
      • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
      • query: SQL query that returns the results required by the component.

   o exec(query, baseRecords): executes the JDBCExtractor component.
      • query: SQL query that returns the results required by the component.
      • baseRecords: record list to be used.

   o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
      • maxPoolSize: maximum number of connections that can be managed by the browser pool at the same time.
      • initialPoolSize: initial number of browser pool connections. A number of idle connections are established, ready to be used.
      • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.

   o disablePool(): disables the connection pool.

   o addDriverProperty(propname, propvalue): adds a JDBC driver property.
      • propname: property name.
      • propvalue: property value.

5.3.19 Loop

• Description: this allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: this allows iterating through different inter-related pages, using one or several browsing sequences.
• Functions:
   o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
      • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
      • iterations: this indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter.
      • sequenceType: type of pool to use. The possible values are:
         • SEQUENCE_IEBROWSER
         • SEQUENCE_HTTP_BROWSER
         • SEQUENCE_FTP
         • SEQUENCE_LOCAL
      • reuse: Boolean value that indicates whether the browser used to date is reused, or whether a new browser is launched maintaining the session’s information.
      • inputPage: this indicates the page from which the next browsing sequence is to be made.

   o next(inputRecords, inputPage): returns the next iteration element.
      • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
      • inputPage: this indicates the page from which the next pages are to be accessed.

   o close(): closes the iterator.

   o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
      • count: number of retries.

   o setRetryDelay(count): configures the interval between two retries.
      • count: interval in milliseconds.

   o syncWithPost(flag): this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization function.
      • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.

   o setBackSequence(back): this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that more data extraction operations can be carried out.
      • back: NSEQL back program.

   o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
      • reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.

   o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

   o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
      • 0: default browser implementation
      • 1: Internet Explorer browser implementation
      • 2: Firefox browser implementation
      • 3: HTTP browser implementation

5.3.21 Output

• Object: Output
• Description: this places a record in the wrapper output.
• Functions:
   o Constructor(structure)
      • structure: parameter that indicates the component input record to be used as the wrapper result.
   o add(record): this allows the component input record to be used as the wrapper result to be added subsequently.
      • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: this allows a record to be constructed using other records generated in the flow, as well as new attributes derived from already existing ones.
• Functions:
   o Constructor(recordsObj, name)
      • recordsObj: list of input elements. Each element from the list can be a record or a list of records.
      • name: name of the output record of the Record Constructor component.

   o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.
      • fieldName: name of the field.
      • expression: field definition expression, e.g. “$0PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
      • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
         • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
         • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

   o exec(): this runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, the Record Constructor will return the list of filtered elements except for the one that caused the error.

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: this creates a browsing sequence built from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.
• Functions:
   o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
      • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
      • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
      • sequenceType: type of pool to use. The possible values are:
         • SEQUENCE_IEBROWSER
         • SEQUENCE_HTTP_BROWSER
         • SEQUENCE_FTP
         • SEQUENCE_LOCAL
      • reuse: Boolean value that indicates whether the browser used to date is reused, or whether a new browser is launched maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.
      • inputPage: optional; this allows a homepage to be indicated.

   o exec(): this returns a page object that represents the target page of the browsing sequences.

   o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.
• Functions:
   o Constructor(page)
      • page: page loaded on the browser that is going to be released.
   o Constructor(browserUuid)
      • browserUuid: browser identifier.
   o exec(): executes the component.

5.3.25 Repeat

• Description: this allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));

Figure 7: Using the Repeat function
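
The practical difference between the Loop structure (section 5.3.19) and the Repeat structure is when the condition is tested. The sketch below shows this with plain JavaScript: makeCondition is a hypothetical stand-in for a Condition object, and, as in the two figures, its exec() result is treated as the value that keeps the loop running.

```javascript
// Stand-in for a Condition object: exec() reports whether work remains.
function makeCondition(state) {
    return { exec: function () { return state.remaining > 0; } };
}

// Loop pattern: the condition is tested before each pass.
function runLoop(state) {
    var passes = 0;
    var loop = makeCondition(state);
    while (loop.exec([])) {
        state.remaining--;
        passes++;
    }
    return passes;
}

// Repeat pattern: the body runs once before the first test.
function runRepeat(state) {
    var passes = 0;
    var repeat = makeCondition(state);
    do {
        state.remaining--;
        passes++;
    } while (repeat.exec([]));
    return passes;
}
```

With no work remaining at the start, runLoop performs no pass, while runRepeat still performs one.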

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the generation graphic interface, it becomes a JavaScript function that is invoked from the position the component holds within the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: this creates a browsing sequence in NSEQL language (see [NSEQL]).
• Functions:
   o Constructor(sequence, sequenceType, reusableConnection, inputPage)
      • sequence: NSEQL browsing program (see [NSEQL]).
      • sequenceType: type of pool to use. The possible values are:
         • SEQUENCE_IEBROWSER
         • SEQUENCE_HTTP_BROWSER
         • SEQUENCE_FTP
         • SEQUENCE_LOCAL
      • reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
      • inputPage: optional parameter; this indicates the starting page. If it is not provided, the NSEQL program is run directly.

   o exec(inputValues, inputPage): this runs the Sequence component, returning the last page that the browsing sequence has reached.
      • inputValues: list of values that can be used as input parameters within the browsing sequence.
      • inputPage: optional parameter; this describes the page from which the component browsing sequence is run.

   o setRetries(count): update function for the number of retries in the event of failures.
      • count: number of retries.

   o setRetryDelay(mseconds): this sets the waiting time between retries.
      • mseconds: waiting time between retries, in milliseconds.

   o close(): this closes the connection with the running browser.

   o syncWithPost(flag): this method indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization function.
      • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.

   o setBackSequence(back): this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that more data extraction operations can be carried out.
      • back: NSEQL back program.

   o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
      • reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.

   o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
      • pages: number of back pages.

   o toString(): this returns the NSEQL (see [NSEQL]) sequence.

   o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
      • 0: default browser implementation
      • 1: Internet Explorer browser implementation
      • 2: Firefox browser implementation
      • 3: Denodo HTTP browser implementation

5.3.28 Store File

• Object: StoreFile
• Description: this stores the contents entered as the input parameter in a file.
• Functions:
   o Constructor(content, file)
      • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content will be stored.
      • file: path and name of the file where the contents are to be stored.

   o exec(): runs the component.

   o setGenerateFilename(generate): this function determines whether the output file name should be automatically generated when the input file is null or is a directory.
      • generate: indicates whether the file name should be automatically generated.

   o setRetries(count): update function for the number of retries in the event of failures.
      • count: number of retries.

   o setRetryDelay(mseconds): this sets the waiting time between retries.
      • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: this represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
   o wait(): this causes the thread to enter standby until all executions invoked with the execute function have finished.

   o execute(functionName, <list of arguments>): this launches a run thread on the described function.
      • functionName: name of the JavaScript function to be run.
      • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

   o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests will be queued until the ongoing executions finish.
      • int: maximum number.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
   o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }
   o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }
   o This function defines the component output type. The possible values are:
      • LIST_TYPE = 1
      • PAGE_TYPE = 2
      • RECORD_TYPE = 3
      • SIMPLE_TYPE = 4
      • ARRAY_TYPE = 5
      • BINARY_TYPE = 6
      • BOOLEAN_TYPE = 7
      • DATE_TYPE = 8
      • DOUBLE_TYPE = 9
      • FLOAT_TYPE = 10
      • INT_TYPE = 11
      • LONG_TYPE = 12
      • STRING_TYPE = 13
      • URL_TYPE = 14
      • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

   o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
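
As an illustration of this file layout, the sketch below outlines a hypothetical custom component called “upper” that returns its input in upper case. The component name, the assumption that the input arrives as a plain string, and the local definition of STRING_TYPE are choices made for this example only. The input structure function is left out because the guide does not show its body.

```javascript
// Hypothetical custom component "upper"; the file would be named upper.js.
var STRING_TYPE = 13;   // value taken from the output type list above

// Main function: receives the component input and returns its output.
function upper_main(upper_input) {
    var upper_output = null;
    if (upper_input !== null && upper_input !== undefined) {
        upper_output = String(upper_input).toUpperCase();
    }
    return upper_output;
}

// Declares the output type. No output structure function is defined,
// because the type is neither RECORD_TYPE nor LIST_TYPE.
function upper_getOutputType() {
    return STRING_TYPE;
}
```

Every function carries the component name as a prefix, which is how the naming scheme described above ties the functions to their component.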

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is simple: the VQL statement is written as follows:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

5.3.5 Condition

• Object: Condition
• Description: allows a condition to be defined. Two output connections determine the process flow, depending on whether the condition is met or not.
• Functions:
   o Constructor(expr)
      • expr: this parameter defines the condition expression. It is expressed as a string of characters (e.g. MyCondition = new Condition("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec function, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

   o exec(elements): main function of the Condition component. This evaluates the condition, returning “true” or “false” depending on whether the condition described in the constructor is met when applied to the input parameter elements.
      • elements: this parameter, which must be in the format “[ELEMENT1, ELEMENT2, …, ELEMENTN]”, determines the elements on which the condition is evaluated.
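
The positional references in a condition expression ($0 for the first element passed to exec, $1 for the second) can be illustrated with a small stand-in. SimpleCondition below is a hypothetical object written only for this example; it is not the ITPilot Condition implementation and supports only expressions of the form "$i <operator> $j".

```javascript
// Stand-in for illustration; not the ITPilot Condition implementation.
function SimpleCondition(expr) {
    var match = /^\(?\s*\$(\d+)\s*(<=|>=|<|>|==)\s*\$(\d+)\s*\)?$/.exec(expr);
    if (!match) {
        throw new Error("unsupported expression: " + expr);
    }
    this.left = parseInt(match[1], 10);    // index of the left element
    this.operator = match[2];
    this.right = parseInt(match[3], 10);   // index of the right element
}

// Applies the comparison to the elements at the referenced positions.
SimpleCondition.prototype.exec = function (elements) {
    var a = elements[this.left];
    var b = elements[this.right];
    switch (this.operator) {
    case "<=": return a <= b;
    case ">=": return a >= b;
    case "<":  return a < b;
    case ">":  return a > b;
    default:   return a == b;
    }
};
```

With this stand-in, new SimpleCondition("($0 <= $1)").exec([1, 2]) compares the first element of the list against the second.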

5.3.6 Create List

• Object: Create_List
• Description: creates an empty list.
• Functions:
   o Constructor(listname): creates an empty list.
      • listname: name of the list of records to be created.
   o exec(): runs the component.

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser
• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.
• Functions:
   o Constructor(): creates a persistent browser and returns its handler.
   o exec(): executes the component.

5.3.8 Diff

• Object: Diff
• Description: the Diff component compares two pages, returning the differences between them with regard to the retrieved HTML code.
• Functions:
   o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
      • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
      • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
      • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
      • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
      • tokenSeparator: character string used as the HTML page element separator when the result page is generated, so that each element can be adequately identified.

   o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
      • baseCode: character string with the source page content.
      • finalCode: character string or page object with the target page content.

   o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
      • baseCode: character string with the source page content.
      • finalCode: character string or page object with the target page content.

   o setAdditionPrefixLabel(additionPrefixLabel): modifies the additional data starting tag.
      • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).

   o setAdditionSuffixLabel(additionSuffixLabel): modifies the additional data ending tag.
      • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).

   o setDeletionPrefixLabel(deletionPrefixLabel): modifies the deleted data starting tag.
      • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).

o setDeletionSuffixLabel(deletionSuffixLabel) modifies the deleted data ending tag

bull deletionSuffixLabel suffix to use when generating the result page for the deleted content (by default red background HTML end tag)

o setNullWhenEquals(nullWhenEquals) if the result page is identical to either of the two input pages the component returns "null" instead of the page itself

bull nullWhenEquals "true" means that "null" is returned when both pages are equal; "false" means that the result page is returned

o setIgnoreTagAttributes(simplifyTags) determines whether the HTML tag attributes are ignored when comparing both pages

bull simplifyTags "true" means that the HTML tag attributes are ignored; "false" means that they are taken into account

o setCaseInsensitive (toLowerCase) determines whether capitalization is taken into account when comparing the pages

bull toLowerCase "true" transforms all HTML content to lower case; "false" keeps the content as is

o setShowRemovedContent(mergedDeletions) determines whether the deleted content is shown in the result page

bull mergedDeletions "true" means that the deleted content is shown; if the value is "false" the labels configured with setDeletionPrefixLabel and setDeletionSuffixLabel are not taken into account

o addTokenReplacement(replacement) allows the addition of a regular expression to a list These regular expressions can be applied on HTML tokens of the source pages before comparing them

bull replacement Perl [PERL] regular expression

o addIgnoredToken(regexp) allows the addition of a regular expression to the list These regular expressions can be applied on HTML tokens of the page Those that match the regular expression will be discarded before starting the comparison

bull regexp Perl [PERL] regular expression
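
The effect of the prefix and suffix labels, the token separator and setShowRemovedContent can be pictured with a small stand-alone sketch. It is not the Diff component itself: the function markDifferences, its options object and the bracket labels are hypothetical, and a plain longest-common-subsequence comparison is assumed in place of whatever algorithm ITPilot uses internally.

```javascript
// Hypothetical stand-in for the marking behaviour described above (not ITPilot's Diff).
function markDifferences(baseCode, finalCode, options) {
  const opts = Object.assign({
    additionPrefixLabel: '<span style="background:green">',
    additionSuffixLabel: '</span>',
    deletionPrefixLabel: '<span style="background:red">',
    deletionSuffixLabel: '</span>',
    tokenSeparator: ' ',
    showRemovedContent: true
  }, options);
  const a = baseCode.split(opts.tokenSeparator);
  const b = finalCode.split(opts.tokenSeparator);

  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  const lcs = [];
  for (let i = 0; i <= a.length; i++) {
    lcs.push(new Array(b.length + 1).fill(0));
  }
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk both token lists, wrapping added and deleted tokens in their labels.
  const out = [];
  let i = 0;
  let j = 0;
  while (i < a.length || j < b.length) {
    if (i < a.length && j < b.length && a[i] === b[j]) {
      out.push(a[i]);
      i++;
      j++;
    } else if (j < b.length && (i === a.length || lcs[i][j + 1] >= lcs[i + 1][j])) {
      out.push(opts.additionPrefixLabel + b[j] + opts.additionSuffixLabel);
      j++;
    } else {
      if (opts.showRemovedContent) {
        out.push(opts.deletionPrefixLabel + a[i] + opts.deletionSuffixLabel);
      }
      i++;
    }
  }
  return out.join(opts.tokenSeparator);
}

const sampleLabels = {
  additionPrefixLabel: '[+', additionSuffixLabel: ']',
  deletionPrefixLabel: '[-', deletionSuffixLabel: ']'
};
const marked = markDifferences('a b c', 'a x c', sampleLabels);
console.log(marked); // a [+x] [-b] c
```

With showRemovedContent set to false in the options, the deleted token is dropped from the result and the deletion labels are never used, which mirrors the behaviour described for setShowRemovedContent.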

539 ExecuteJS

bull Description ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence This component is transformed into a Sequence command (see section 5327) that executes the ExecuteJS NSEQL command (see [NSEQL])

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4 Using the ExecuteJS NSEQL command
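
The setRetries and setRetryDelay calls in the figure can be read as a retry loop around the sequence execution. The following stand-alone sketch illustrates that reading only; execWithRetries and its injected sleep function are hypothetical names, and it assumes that the retry count means additional attempts after the first failure, which this guide does not state explicitly.

```javascript
// Hypothetical helper: runs `action` once and, on failure, up to `retries` more times.
// `sleep` is injected so that the sketch stays synchronous and easy to test.
function execWithRetries(action, retries, retryDelayMs, sleep) {
  let lastError = null;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return action(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        sleep(retryDelayMs); // wait before the next attempt
      }
    }
  }
  // All attempts failed: raise the last error (comparable to ON_ERROR_RAISE).
  throw lastError;
}

// Usage: the simulated sequence fails twice and succeeds on the third attempt.
const waits = [];
const retryOutput = execWithRetries(
  function (attempt) {
    if (attempt < 2) {
      throw new Error('SEQUENCE_ERROR');
    }
    return 'page after attempt ' + attempt;
  },
  3,
  3000,
  function (ms) { waits.push(ms); }
);
console.log(retryOutput, waits);
```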

5310 Expression

bull Object Expression

bull Description allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value

bull Functions

o Constructor(expression)

bull expression object that defines the expression, written as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression ITPilot provides a set of functions defined in Appendix A of [GENER]

o exec(exprInput) method running the component and returning the value resulting from the expression indicated in the component constructor

bull exprInput list of zero or more values zero or more records or zero or more record lists that are used as part of the expression
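
The positional placeholders ($0, $1, ...) refer to the elements of the list passed to exec. As a rough illustration of that binding only, the sketch below evaluates a single comparison between two placeholders. The function evalPositionalComparison is hypothetical and covers a tiny subset of what is described here; the real expression language and its functions are those defined in Appendix A of [GENER].

```javascript
// Hypothetical evaluator for expressions of the form "($i <op> $j)".
// $n is replaced by the n-th element of the inputs list (starting at 0).
function evalPositionalComparison(expression, inputs) {
  const match = /^\(?\s*\$(\d+)\s*(<=|>=|==|<|>)\s*\$(\d+)\s*\)?$/.exec(expression);
  if (match === null) {
    throw new Error('Unsupported expression: ' + expression);
  }
  const left = inputs[Number(match[1])];
  const right = inputs[Number(match[3])];
  switch (match[2]) {
    case '<=': return left <= right;
    case '>=': return left >= right;
    case '==': return left === right;
    case '<': return left < right;
    default: return left > right; // the only remaining operator is ">"
  }
}

console.log(evalPositionalComparison('($0 <= $1)', [3, 5])); // true
console.log(evalPositionalComparison('($0 <= $1)', [7, 5])); // false
```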

5311 Extractor

bull Object Extractor

bull Description this is responsible for extracting structured data from an HTML page thus generating a DEXTL program ([DEXTL])

bull Functions

o Constructor(name page specification structure)

bull name name of the Extractor component instance

bull page page-type ITPilot structure from where data is to be extracted

bull specification DEXTL data extraction specification (see [DEXTL])

bull structure name of the record (previously created) that will be used to return the data extracted by the specification

o exec() main extractor method running the specification indicated in the constructor This function returns a list of records of the type defined in the constructor in the structure parameter

o setMergePatterns(merge) This applies the technique of merging patterns for greater system optimization (see [GENER] for further information)

bull merge Boolean parameter "true" if the pattern merge technique is to be applied or "false" if not. This is "true" by default

o setI18n(i18n) Function that updates the process internationalization

bull i18n type of internationalization to use ITPilot provides different types of internationalization options such as ES_EURO US_PST GB and so on See [GENER] for more information about internationalization in ITPilot

5312 Fetch

bull Object Fetch

bull Description this obtains the contents of the URL or page used as the input argument and returns them in binary or text format

bull Functions

o Constructor(url sequenceType reusableConnection binary page)

bull url URL where the resource to be downloaded can be found (OPTIONAL)

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reusableConnection indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information

bull binary "true" if the object to be downloaded is binary; "false" if it is in text format

bull page Optionally the page from which the http request is launched can be indicated

o exec(page) This runs the component returning the string- or binary-type value obtained

bull page Optionally the page from which the http request is launched can be indicated

o setEncoding(encoding) allows the user to determine the MIME type [MIME] of the information to send

bull encoding MIME type of the information to send

o syncWithPost(flag) this function sets the method used to recover the page state: ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method

bull flag "true" means that this synchronization method must be used. If it is "false" ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists ITPilot executes a Back() NSEQL command

o setBackSequence(back) this function optionally sets an explicit browsing sequence that returns to the source page, against which further data extraction operations will be executed

bull back back sequence NSEQL program

o setReusingConnection(reusingConnection) this function indicates whether connections will be reused or not

bull reusingConnection if the value is set to "true" the connection coming from previous components is reused; if set to "false" a new browser is launched, importing information from the previous session

o setBackPages(pages) this function determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor a POST navigation has been defined

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 Denodo HTTP browser implementation
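
The interplay between syncWithPost, setBackSequence and setBackPages described above amounts to a three-way decision. The sketch below restates it as plain JavaScript; describeReturnStrategy and the shape of its config object are hypothetical, and the default of one page for the Back() case is an assumption, not a documented value.

```javascript
// Hypothetical restatement of how the page state is recovered, following the
// order described above: POST first, then an explicit back sequence, then Back().
function describeReturnStrategy(config) {
  if (config.syncWithPost) {
    return { method: 'POST', detail: 're-send the original POST parameters to the page URL' };
  }
  if (config.backSequence) {
    return { method: 'BACK_SEQUENCE', detail: config.backSequence };
  }
  // Assumption: one page back when setBackPages was never called.
  const pages = config.backPages || 1;
  return { method: 'BACK', detail: 'Back() x ' + pages };
}

// "<NSEQL back program>" is a placeholder, not a real NSEQL program.
console.log(describeReturnStrategy({ syncWithPost: true }).method);
console.log(describeReturnStrategy({ syncWithPost: false, backSequence: '<NSEQL back program>' }).method);
console.log(describeReturnStrategy({ syncWithPost: false, backPages: 2 }).detail);
```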

5313 Filter

bull Object Filter

bull Description this carries out a filtering operation from a list of records returning those meeting a given condition

bull Functions

o Constructor(expr auxiliaryRecords)

bull expr selection expression of the filtering operation, applied to the list of records described in the exec function

bull auxiliaryRecords record list that participates in the filter condition but which are not the records to filter

o exec(inputRecords auxiliaryRecords) function receiving a list of records and returning the subgroup complying with the selection expression indicated in the constructor

bull inputRecords list of input records

bull auxiliaryRecords record list that participates in the filter condition but which are not the records to filter

NOTE If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error
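
The behaviour described in the note can be sketched in a few lines of stand-alone JavaScript. This is an illustration only: filterRecords, the condition callback and the two local constants are hypothetical stand-ins for the Filter component and its error handler settings.

```javascript
// Local stand-ins for the error handler settings named in this guide.
const FILTER_ON_ERROR_RAISE = 'ON_ERROR_RAISE';
const FILTER_ON_ERROR_IGNORE = 'ON_ERROR_IGNORE';

// Hypothetical filter: keeps the records for which `condition` returns true.
// With ON_ERROR_IGNORE, a record whose evaluation fails is skipped.
function filterRecords(inputRecords, condition, onError) {
  const result = [];
  for (const record of inputRecords) {
    try {
      if (condition(record)) {
        result.push(record);
      }
    } catch (error) {
      if (onError !== FILTER_ON_ERROR_IGNORE) {
        throw error;
      }
      // Ignored: the offending record is left out of the result.
    }
  }
  return result;
}

const offers = [{ price: 10 }, { price: null }, { price: 30 }];
function isAffordable(record) {
  if (record.price === null) {
    throw new Error('price is missing');
  }
  return record.price <= 20;
}
console.log(filterRecords(offers, isAffordable, FILTER_ON_ERROR_IGNORE));
```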

5314 Form Iterator

bull Object Form_Iterator

bull Description this allows a run loop to be generated for a specific form where predetermined values for each of the fields included are used in each run

bull Functions

o Constructor(findForm submitForm sequenceType reusableConnection baseElements inputPage parallelIterator)

bull findForm NSEQL program that allows for the form to be used as the basis of the iteration to be found (see [NSEQL] for further information on NSEQL)

bull submitForm NSEQL program that allows for the form to be invoked (see [NSEQL] for further information on NSEQL)

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reusableConnection indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information

bull baseElements optional list of records that can be employed as variables to use in the different NSEQL browsing sequences used in this component

bull inputPage input page from which the selected form can be iteratively invoked

bull parallelIterator "true" means that the component will execute its iterations in parallel

o selectMultiplePositions(field position positionsArray clickedArray) indicates what positions are selected in a multiple selection field in the target form

bull field name of the multiple selection field

bull position position of the field among those with the same name, starting with position 0

bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values

bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this

bull CLICKED_ELEMENT mark the element

bull NON_CLICKED_ELEMENT leave the element as unmarked

bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked

o selectMultipleTexts(field position valuesArray positionsArray equalsArray clickedArray) this indicates the values selected from a multiple selection field for the chosen form

bull field name of the multiple selection field

bull position position of the field among those with the same name, starting with position 0

bull valuesArray list of values that must be selected in the field

bull positionsArray list that indicates the position held for each valuesArray element in the event of replicated values

bull equalsArray list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false)

bull clickedArray list that indicates whether each valuesArray element can be marked not marked or both There are certain JavaScript constants defined for this

bull CLICKED_ELEMENT mark the element

bull NON_CLICKED_ELEMENT leave the element as unmarked

bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked

o selectPositions(field position positions) this indicates the values selected from a selection field for the chosen form

bull field name of the HTML selection field

bull position position occupied in the event of more than one field element with the same name

bull positions values of the elements on which the component must iterate

o selectTexts(field position values positions equal) this indicates the values to be used in the different iterations on a text field

bull field name of the HTML text field

bull position position of the field in the event of several on the form with the same value

bull values list of values that must be selected in the field

bull positions list that indicates the position held for each value element in the event of replicated values

bull equals boolean value that indicates whether the field values must exactly match those provided to the function ("true") or only need to contain them ("false")

o click(field value state) function that selects an element and runs a "click" event on it

bull field name of the HTML field on which the click is to be made

bull value when this function is run on Radio Buttons this parameter indicates the elements selected as a list (e.g. [0, 1]). When run on Checkboxes it indicates the value of the selectable element

bull state when this function is run on Radio Buttons this parameter is not used When run on Checkboxes it indicates the status of the element

bull CLICKED_ELEMENT mark the element

bull NON_CLICKED_ELEMENT leave the element as unmarked

bull CLICKED_AND_NON_CLICKED_ELEMENT generates two combinations one with the element marked and another with the element unmarked

o input(field position values) function that indicates the values added to an input field

bull field name of the HTML input field

bull position position of the field in the event of several on the form with the same name

bull values list of values that must be selected in the field

o textarea(field position values) this indicates the values added to a text area

bull field name of the HTML input field

bull position position of the field in the event of several on the form with the same name

bull values list of values that must be selected in the field

o toList() returns the list with the NSEQL sequences used in each iteration

o setMaxIterations(count) sets the maximum number of iterations that can be executed

bull count number that determines the maximum number of iterations

o setRetries(count) update method for the number of retries in the event of failures

bull count number of retries

o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated

bull mseconds this indicates the waiting time between retries in milliseconds

o setParallelIterator(flag) the component launches the iteration in parallel

bull flag "true" means that the iterations will be executed in parallel

o next(inputPage) this returns the page resulting from running a component iteration

bull inputPage optional parameter that allows for a new starting page to be indicated on which a new component iteration is run

o hasNext() function that determines whether there are more results. The function returns "true" if there is at least one more result or "false" if there is not

o close() function that closes the iterator

o syncWithPost(flag) this function indicates whether, to recover the state of the page, a POST message must be sent to the page URL containing the POST parameters with which it was originally accessed. This is the default synchronization method

bull flag "true" indicates that this synchronization method is to be used. If it is "false" ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, an NSEQL Back() command is run

o setBackSequence(back) this function optionally sets an explicit browsing sequence that returns to the source page so that further data extraction operations can be carried out

bull back NSEQL back program

o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not

bull reusingConnection if "true" the connection from previous components will be reused. With the parameter set to "false" a new browser is opened and the data is imported from the previous session

o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 Denodo HTTP browser implementation
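
The three click constants used by selectMultiplePositions, selectMultipleTexts and click determine how many form combinations the iterator produces: CLICKED_AND_NON_CLICKED_ELEMENT doubles the combinations for its element. The sketch below shows that expansion in isolation. The function expandClickStates and the record shape it returns are hypothetical, and the constants are declared locally so that the example runs outside ITPilot.

```javascript
// Local stand-ins for the JavaScript constants named in this section.
const CLICKED = 'CLICKED_ELEMENT';
const NON_CLICKED = 'NON_CLICKED_ELEMENT';
const CLICKED_AND_NON_CLICKED = 'CLICKED_AND_NON_CLICKED_ELEMENT';

// Hypothetical expansion: one entry per combination of marked/unmarked values.
function expandClickStates(valuesArray, clickedArray) {
  let combinations = [[]];
  valuesArray.forEach(function (value, index) {
    const state = clickedArray[index];
    // "Both" yields a marked and an unmarked variant; otherwise a single variant.
    const variants = state === CLICKED_AND_NON_CLICKED ? [true, false] : [state === CLICKED];
    const extended = [];
    combinations.forEach(function (partial) {
      variants.forEach(function (clicked) {
        extended.push(partial.concat([{ value: value, clicked: clicked }]));
      });
    });
    combinations = extended;
  });
  return combinations;
}

// One element always marked, one never marked, one tried both ways: two combinations.
const formRuns = expandClickStates(
  ['red', 'green', 'blue'],
  [CLICKED, NON_CLICKED, CLICKED_AND_NON_CLICKED]
);
formRuns.forEach(function (run) {
  console.log(run.map(function (e) { return e.value + '=' + e.clicked; }).join(' '));
});
```

A cap such as the one set by setMaxIterations would, under this reading, simply bound how many of these combinations are executed.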

5315 Get Page

bull Object Get_Page

bull Description obtains an active browser from the browser pool using a previously retrieved identification code

bull Functions

o Constructor(browserUuid) obtains (or optionally creates) the handler to an active browser from its identification

bull browserUuid browser id

o exec(pageType lastURL lastURLMethod lastURLPostParameters cookie proxyUser proxyPassword proxyDomain) executes the component and returns a Page object with information about the browser's current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5327)

bull pageType type of browser used to access the page

bull SEQUENCE_IEBROWSER = 1

bull SEQUENCE_HTTP_BROWSER = 2

bull lastURL last URL where the page is coming from

bull lastURLMethod access method (GET POST) of the URL the page is coming from

bull lastURLPostParameters POST-method parameters of the URL the page is coming from

bull cookie stored cookie information ("cookies")

bull proxyUser user name to access the Proxy if required

bull proxyPassword user password to access the Proxy if required

bull proxyDomain Proxy domain if required

5316 Init

bull Object Init

bull Description is responsible for storing the structure of the input data which is the data that the wrapper will receive from the calling application

bull Functions

o Constructor(input output)

bull input input record of the component. Optional; used only when custom components are created (see section 54). In the case of standard processes ITPilot takes this information from the JavaScript context

bull output name of the output record of the component which represents the query parameters of the wrapper Its use is optional in the standard process main function if not specified the record will be generated at runtime (with the exec() function)

o get(name) this returns the value of a record field created as a group of initialization parameters

bull name name of the record field

o setText(field obl fixedValue) this creates a text-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setInt(field obl fixedValue) this creates an integer-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setLong(field obl fixedValue) this creates a long-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setFloat(field obl fixedValue) this creates a floating-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setDouble(field obl fixedValue) this creates a double-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setBlob(field obl fixedValue) this creates a BLOB-type (binary large object) field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setBoolean(field obl fixedValue) this creates a Boolean-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setLink(field obl fixedValue) this creates a URL-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setDate(field format obl fixedValue) this creates a date-type field in the initialization record

bull field name of the field to create

bull format representation format of the date field. This parameter is optional, but when a format is specified the input values must conform to it; otherwise the wrapper may fail to run. The representation format is defined in [DATEFORMAT]

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setName(name) update function for the component name

bull name new component name

o setI18n(i18n) function which updates the process i18n

bull i18n type of internationalization to be used ITPilot provides different types of i18n configurations such as ES_EURO US_PST GB etc See [GENER] for more information about internationalization in ITPilot

o exec() main function for running the component returning a record representing the wrapper initialization parameters
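
The three values of the obl parameter (OPTIONAL, OBLIGATORY and FIXED) can be illustrated with a stand-alone sketch of how an initialization record might be assembled from the query parameters. The function buildInitRecord and the field definition objects are hypothetical, and the constants are declared locally. Representing a missing optional value as null is an assumption of this sketch.

```javascript
// Local stand-ins for the obl values described above.
const OPTIONAL = 'OPTIONAL';
const OBLIGATORY = 'OBLIGATORY';
const FIXED = 'FIXED';

// Hypothetical assembly of the initialization record from the query parameters.
function buildInitRecord(fieldDefinitions, queryParameters) {
  const record = {};
  fieldDefinitions.forEach(function (definition) {
    const obl = definition.obl || OPTIONAL; // OPTIONAL is the default value
    const supplied = Object.prototype.hasOwnProperty.call(queryParameters, definition.field);
    if (obl === FIXED) {
      // A fixed field always takes its constant value.
      record[definition.field] = definition.fixedValue;
    } else if (supplied) {
      record[definition.field] = queryParameters[definition.field];
    } else if (obl === OBLIGATORY) {
      throw new Error('Missing obligatory parameter: ' + definition.field);
    } else {
      record[definition.field] = null; // assumption: absent optional values are null
    }
  });
  return record;
}

const searchFields = [
  { field: 'QUERY', obl: OBLIGATORY },
  { field: 'LANGUAGE', obl: FIXED, fixedValue: 'en' },
  { field: 'MAX_PRICE' }
];
console.log(buildInitRecord(searchFields, { QUERY: 'laptop', LANGUAGE: 'fr' }));
```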

5317 Iterator

bull Object Iterator

bull Description component that iterates on a list of records one by one

bull Functions

o Constructor(list)

bull list list of records on which to iterate

o hasNext() determines whether there are more results on which to iterate. "true" is returned if there is at least one more result

o next() this returns the next iteration element The list is a sorted sequence of records

The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5329

var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}

Figure 5 Using threads in the Iterator component
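
For comparison with the threaded form, the sequential hasNext()/next() contract of the Iterator can be sketched with a small stand-alone object. ListIterator is a hypothetical stand-in written for this example, not the ITPilot Iterator, and raising an error when next() is called past the end of the list is an assumption.

```javascript
// Hypothetical stand-in that follows the hasNext()/next() contract described above.
function ListIterator(list) {
  this.list = list;
  this.position = 0;
}
ListIterator.prototype.hasNext = function () {
  return this.position < this.list.length;
};
ListIterator.prototype.next = function () {
  if (!this.hasNext()) {
    throw new Error('No more records'); // assumption: exhaustion is an error
  }
  const record = this.list[this.position];
  this.position += 1;
  return record;
};

// Records are visited one by one, in list order.
const recordIterator = new ListIterator([{ title: 'first' }, { title: 'second' }]);
const seenTitles = [];
while (recordIterator.hasNext()) {
  const recordInstance = recordIterator.next();
  seenTitles.push(recordInstance.title);
}
console.log(seenTitles.join(', '));
```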

5318 JDBCExtractor

bull Object JDBCExtractor

bull Description this component sends a query to any source available via JDBC and returns a record list with the results obtained

bull Functions

o Constructor (uuid uri driver userName password structure baseRecords maxPoolSize initialPoolSize checkQuery query)

bull uuid component unique identifier

bull uri connection URL to the database

bull driver driver class to use to connect to the data source

bull userName user name

bull password user password

bull structure structure of the component's output record list. It is defined as a record of values

bull baseRecords record list to be used

bull maxPoolSize maximum number of connections that can be managed by the connection pool at the same time

bull initialPoolSize initial number of connections in the pool. This number of idle connections is established in advance, ready to be used

bull checkQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists

bull query SQL query that returns the results required by the component

o exec(query baseRecords) executes the JDBCExtractor component

bull query SQL query that returns the results required by the component

bull baseRecords record list to be used

o setPoolConfig(maxPoolSize initialPoolSize pingQuery) updates the pool configuration

bull maxPoolSize maximum number of connections that can be managed by the connection pool at the same time

bull initialPoolSize initial number of connections in the pool. This number of idle connections is established in advance, ready to be used

bull pingQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists

o disablePool() disables the connection pool

o addDriverProperty(propname propvalue) adds a JDBC driver property

bull propname property name

bull propvalue property value
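
The roles of maxPoolSize, initialPoolSize and the check query can be pictured with a minimal connection pool. The sketch below is a generic illustration, not the pool used by JDBCExtractor: SimplePool, createConnection and isAlive are hypothetical names, and isAlive stands in for running the check query against an idle connection.

```javascript
// Hypothetical pool: opens `initialPoolSize` connections up front, never hands out
// more than `maxPoolSize` at once, and validates idle connections before reuse.
function SimplePool(createConnection, maxPoolSize, initialPoolSize, isAlive) {
  this.createConnection = createConnection;
  this.maxPoolSize = maxPoolSize;
  this.isAlive = isAlive;
  this.idle = [];
  this.inUse = 0;
  for (let i = 0; i < initialPoolSize; i++) {
    this.idle.push(createConnection());
  }
}
SimplePool.prototype.acquire = function () {
  while (this.idle.length > 0) {
    const candidate = this.idle.pop();
    if (this.isAlive(candidate)) { // stand-in for running the check query
      this.inUse += 1;
      return candidate;
    }
    // Stale connection: discard it and try the next idle one.
  }
  if (this.inUse >= this.maxPoolSize) {
    throw new Error('Pool exhausted');
  }
  this.inUse += 1;
  return this.createConnection();
};
SimplePool.prototype.release = function (connection) {
  this.inUse -= 1;
  this.idle.push(connection);
};

let openedConnections = 0;
const pool = new SimplePool(
  function () { openedConnections += 1; return { id: openedConnections, alive: true }; },
  2,
  1,
  function (connection) { return connection.alive; }
);
const firstConnection = pool.acquire();  // reuses the connection opened at start-up
const secondConnection = pool.acquire(); // opens a second one, reaching maxPoolSize
console.log(firstConnection.id, secondConnection.id, openedConnections);
```

The requirement that the check query be simple follows from this design: it runs every time an idle connection is about to be reused.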

5319 Loop

bull Description This allows loops to be made in the flow. The loop will be repeated as long as the given condition is met (WHILE ... DO). The loop component is implemented in JavaScript using a while loop with a Condition object used as the loop output condition. The Condition object is defined in section 535. To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    ...
}

Figure 6 Using the Loop function
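
The figure is a template, so it cannot be run as written. A runnable analogue is sketched below under stated assumptions: StandInCondition is a hypothetical object that only mimics the exec() call of the Condition component by delegating to an ordinary JavaScript function, and the page counter is invented for the example.

```javascript
// Hypothetical stand-in exposing only the exec() call used in the figure.
function StandInCondition(predicate) {
  this.predicate = predicate;
}
StandInCondition.prototype.exec = function (inputs) {
  return Boolean(this.predicate.apply(null, inputs));
};

// The loop body runs for as long as the condition evaluates to true.
let currentPage = 1;
const visitedPages = [];
const loopCondition = new StandInCondition(function () {
  return currentPage <= 3;
});
while (loopCondition.exec([])) {
  visitedPages.push(currentPage); // <loop operations>
  currentPage += 1;
}
console.log(visitedPages); // [ 1, 2, 3 ]
```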

5320 Next Interval Iterator

bull Object Next_Interval_Iterator

bull Description this allows for iteration over a series of inter-related pages using one or several browsing sequences

bull Functions

o Constructor(sequences iterations sequenceType reuse inputPage)

bull sequences list of browsing sequences to use If there is only one sequence it will try to use it in all iterations If there is more than one sequence it will use one in each iteration

bull iterations this indicates for every sequence the number of iterations to be made the size of this list must be equal to the size of the list provided in the sequences parameter This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reuse boolean value that indicates whether the browser used to date is reused or whether a new browser is launched maintaining the session's information

bull inputPage this indicates the page from which the next browsing sequence is to be made

o next(inputRecords inputPage) this returns the next iteration element

bull inputRecords list of input records that can be used as parameters within the browsing sequences at the next interval

bull inputPage this indicates the page from which the next pages are to be accessed

o close() this closes the iterator

o setRetries(count) this configures the number of retries in the event of error in accessing the next page

bull count number of retries

o setRetryDelay(count) this configures the interval between two retries

bull count interval in milliseconds


  o syncWithPost(flag): this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.
  o setBackSequence(back): this function optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation


5.3.21 Output

• Object: Output

• Description: this places a record in the wrapper output.

• Functions:

  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): this adds a component input record to the wrapper output, so that it is returned as a wrapper result.
    • record: record to use.


5.3.22 Record Constructor

• Object: Record_Constructor

• Description: this allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.

• Functions:

  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression, e.g. "$0PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run when the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): this runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.


5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: This creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; this allows a starting page to be indicated.
  o exec(): this returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.


5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.

• Functions:

  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.


5.3.25 Repeat

• Description: This allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT... UNTIL). The Repeat component is implemented in JavaScript using a do... while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
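The practical difference between a while loop and a do... while loop is the point at which the condition is checked. The following sketch is not ITPilot code: FakeCondition is a hypothetical stand-in for the Condition object (it wraps a plain JavaScript function instead of an ITPilot expression) so that the control flow can be run outside ITPilot.

```javascript
// Hypothetical stand-in for the ITPilot Condition object: exec() just calls
// a plain JavaScript predicate. The real object evaluates an ITPilot expression.
function FakeCondition(predicate) {
    this.predicate = predicate;
}
FakeCondition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// while pattern: the condition is checked before each pass,
// so the body may run zero times.
function runWhile(condition) {
    var passes = 0;
    while (condition.exec([])) {
        passes++;
    }
    return passes;
}

// do... while pattern: the body runs once before the first check.
function runDoWhile(condition) {
    var passes = 0;
    do {
        passes++;
    } while (condition.exec([]));
    return passes;
}

var alwaysFalse = new FakeCondition(function () { return false; });
var whilePasses = runWhile(alwaysFalse);     // body never runs
var doWhilePasses = runDoWhile(alwaysFalse); // body runs exactly once
```

With a condition that is false from the start, the while version performs no passes and the do... while version performs one, which is why Repeat suits operations that must run at least once.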


5.3.26 Script

• Description: This component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When this component is used from the graphical generation tool, it becomes a JavaScript function that is invoked from its place within the process flow.


5.3.27 Sequence

• Object: Sequence

• Description: This creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter; this indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): this runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; this describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): this closes the connection with the running browser.
  o syncWithPost(flag): this method indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.


  o setBackSequence(back): this function optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): this returns the NSEQL (see [NSEQL]) sequence.
  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
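The combined effect of setRetries and setRetryDelay can be pictured with the following sketch. It is not ITPilot code: runWithRetries and the injected sleep function are hypothetical names, and it assumes that the retry count means extra attempts after the first one, which this guide does not state.

```javascript
// Illustrative only. "sleep" is injected so the sketch stays synchronous.
function runWithRetries(action, retries, delayMs, sleep) {
    var lastError = null;
    // Assumption: "retries" counts extra attempts after the first one.
    for (var attempt = 0; attempt <= retries; attempt++) {
        try {
            return action(attempt);
        } catch (e) {
            lastError = e;
            if (attempt < retries) {
                sleep(delayMs); // wait before the next attempt
            }
        }
    }
    throw lastError; // every attempt failed
}

// Usage: the action fails twice and succeeds on the third attempt.
var waits = [];
var page = runWithRetries(
    function (attempt) {
        if (attempt < 2) { throw new Error("navigation failed"); }
        return "page after attempt " + attempt;
    },
    3,
    3000,
    function (ms) { waits.push(ms); }
);
```

Here the two failed attempts are each followed by a 3000 ms wait, and the error is only propagated when no attempts remain.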


5.3.28 Store File

• Object: StoreFile

• Description: this stores the contents passed as the input parameter in a file.

• Functions:

  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): this function determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.


5.3.29 Thread

• Object: Thread

• Description: this represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): this causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): this launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
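The queueing behaviour described for setMaxConcurrentThreads can be illustrated with a small limiter. This is a sketch in plain JavaScript, not the ITPilot implementation: ThreadLimiter and its methods are hypothetical names, and each task signals completion by calling the callback it receives.

```javascript
// Hypothetical limiter: at most maxConcurrent tasks run at once,
// later ones wait in a queue until a running task finishes.
function ThreadLimiter(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = 0;
    this.pending = [];
}
ThreadLimiter.prototype.execute = function (task) {
    this.pending.push(task);
    this.startPending();
};
ThreadLimiter.prototype.startPending = function () {
    var self = this;
    while (this.running < this.maxConcurrent && this.pending.length > 0) {
        var task = this.pending.shift();
        this.running++;
        // The task calls this callback when it is done, freeing one slot.
        task(function () {
            self.running--;
            self.startPending();
        });
    }
};

// Usage: three tasks, two slots. Each task records that it started and
// hands back its "done" callback so completion can be triggered later.
var started = [];
var finishers = [];
var limiter = new ThreadLimiter(2);
["a", "b", "c"].forEach(function (name) {
    limiter.execute(function (done) {
        started.push(name);
        finishers.push(done);
    });
});
var startedBefore = started.slice(); // "c" is still queued
finishers[0]();                      // "a" finishes, so "c" may start
var startedAfter = started.slice();
```

The third task only starts once one of the first two has reported completion, which mirrors the "later requests are queued" rule.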


5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.

• mycustom_getInputStructure() { ... }
  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { ... }


  o This function defines the output structure returned by the component. It is only necessary when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
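As an illustration, a minimal component file might look like the sketch below. Several points are assumptions rather than facts from this guide: the file is taken to be named mycustom.js, the input is assumed to arrive as a list whose first element is a string, and STRING_TYPE is defined locally (with the value listed above) only so that the sketch runs on its own. mycustom_getInputStructure is left out because the guide does not show how the input schema is written.

```javascript
// Sketch of a custom component file, assumed to be saved as mycustom.js.

// Output type code taken from the list of possible values. Defined here only
// so the sketch runs standalone; inside ITPilot it may already be available.
var STRING_TYPE = 13;

// Main function. Assumption: the input is a list whose first element is a
// string; the component returns that string in upper case.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    if (mycustom_input && mycustom_input.length > 0) {
        mycustom_output = String(mycustom_input[0]).toUpperCase();
    }
    return mycustom_output;
}

// Declares the output type. No mycustom_getOutputStructure is defined
// because the type is neither RECORD_TYPE nor LIST_TYPE.
function mycustom_getOutputType() {
    return STRING_TYPE;
}
```

Returning a simple type keeps the component small, since an output structure only has to be described for record and list outputs.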

5.4.2 Using Custom Components

To use a custom component developed in JavaScript, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is simple. The VQL statement is written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.
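The escaping step in the note can be automated before the script is pasted into the VQL statement. The helper below is a hypothetical sketch, not part of ITPilot: escapeForVql is an invented name, and the example call assumes that the delimiter is a single quote and that the escape character is a backslash. Both are passed as parameters so they can be changed if the VQL syntax in use differs.

```javascript
// Hypothetical helper: prefixes every occurrence of the quote character
// with the escape character. Nothing else in the code is modified.
function escapeForVql(jsCode, quoteChar, escapeChar) {
    return jsCode.split(quoteChar).join(escapeChar + quoteChar);
}

// Assumed delimiter: single quote. Assumed escape character: backslash.
var escaped = escapeForVql("var s = 'a';", "'", "\\");
```

The result is the same script with each inner quote preceded by the escape character, ready to be placed between the delimiting quotes.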


REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl


5.3.6 Create List

• Object: Create_List

• Description: creates an empty list.

• Functions:

  o Constructor(listname): creates an empty list.
    • listname: name of the list of records to be created.
  o exec(): runs the component.


5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser

• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.

• Functions:

  o Constructor(): creates a persistent browser and returns its handler.
  o exec(): executes the component.


5.3.8 Diff

• Object: Diff

• Description: the Diff component compares two pages and returns the differences between them in the retrieved HTML code.

• Functions:

  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as the HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns "true" if both pages are identical, "false" if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages, pointing out the differences between them.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.
    • additionPrefixLabel: prefix to use when generating the result page for new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).


  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns "null" instead of the page itself.
    • nullWhenEquals: "true" implies that "null" is returned when both pages are equal; "false" means that the result page is returned.
  o setIgnoreTagAttributes(simplifyTags): the component will not take the HTML tag attributes into account when comparing both pages.
    • simplifyTags: "true" means that the HTML tag attributes are ignored. With "false", they are not ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: "true" transforms all HTML content to lower case; "false" keeps the content as is.
  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.
    • mergedDeletions: with "true", the deleted content is shown. If the value is "false", the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list. These regular expressions can be applied to HTML tokens of the page. Tokens that match a regular expression are discarded before starting the comparison.
    • regexp: Perl [PERL] regular expression.
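To make the role of the prefix and suffix labels and of the token separator more concrete, the sketch below marks added and deleted tokens in a merged result. It is only an illustration: the algorithm used by the Diff component is not described in this guide, a longest-common-subsequence comparison is swapped in here, and diffTokens and the label values are hypothetical.

```javascript
// Illustrative token diff based on a longest common subsequence (LCS).
function diffTokens(baseCode, finalCode, labels) {
    var a = baseCode.split(labels.tokenSeparator);
    var b = finalCode.split(labels.tokenSeparator);
    var i, j;

    // lcs[i][j] = length of the LCS of a[i..] and b[j..]
    var lcs = [];
    for (i = 0; i <= a.length; i++) {
        lcs.push([]);
        for (j = 0; j <= b.length; j++) { lcs[i].push(0); }
    }
    for (i = a.length - 1; i >= 0; i--) {
        for (j = b.length - 1; j >= 0; j--) {
            lcs[i][j] = a[i] === b[j]
                ? lcs[i + 1][j + 1] + 1
                : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
        }
    }

    // Walk both token lists, wrapping deletions and additions in their labels.
    var out = [];
    i = 0;
    j = 0;
    while (i < a.length || j < b.length) {
        if (i < a.length && j < b.length && a[i] === b[j]) {
            out.push(a[i]);
            i++;
            j++;
        } else if (i < a.length &&
                   (j === b.length || lcs[i + 1][j] >= lcs[i][j + 1])) {
            out.push(labels.deletionPrefix + a[i] + labels.deletionSuffix);
            i++;
        } else {
            out.push(labels.additionPrefix + b[j] + labels.additionSuffix);
            j++;
        }
    }
    return out.join(labels.tokenSeparator);
}

// Usage with plain-text labels instead of HTML tags.
var merged = diffTokens("a b c", "a c d", {
    additionPrefix: "[+", additionSuffix: "+]",
    deletionPrefix: "[-", deletionSuffix: "-]",
    tokenSeparator: " "
});
```

In this run the token "b" is wrapped in the deletion labels and "d" in the addition labels, while "a" and "c" are left unmarked.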


5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command


5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.

• Functions:

  o Constructor(expression)
    • expression: object that defines the expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
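The positional placeholders in an expression such as ($0 <= $1) refer to the elements of the list passed to exec, by index. The sketch below only illustrates that mapping: resolvePlaceholders is a hypothetical helper that substitutes values into the string, and it is not how ITPilot evaluates expressions.

```javascript
// Illustration only: replaces $0, $1, ... with the matching entries of the
// input list, to show which element each placeholder refers to.
function resolvePlaceholders(expression, inputs) {
    return expression.replace(/\$(\d+)/g, function (match, index) {
        var position = parseInt(index, 10);
        if (position >= inputs.length) {
            throw new Error("No input for placeholder " + match);
        }
        return String(inputs[position]);
    });
}

// $0 maps to the first list element and $1 to the second.
var resolved = resolvePlaceholders("($0 <= $1)", [3, 5]);
```

A placeholder whose index is beyond the end of the input list is reported as an error in this sketch, rather than being silently left in place.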


5.3.11 Extractor

• Object: Extractor

• Description: this is responsible for extracting structured data from an HTML page using a DEXTL program ([DEXTL]).

• Functions:

  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.
  o setMergePatterns(merge): this applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter. "true" if the pattern merge technique is to be applied, or "false" if not. This is "true" by default.
  o setI18n(i18n): function that updates the internationalization used by the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.


5.3.12 Fetch

• Object: Fetch

• Description: this obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • binary: "true": the object to be downloaded is binary. "false": the object to be downloaded is in text format.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o exec(page): this runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched can be indicated.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): this function lets the user set the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: "true" means that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If it does not exist, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): this function lets the user optionally set an explicit browsing sequence back to the source page, against which more information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): this function indicates whether connections will be reused or not.


    • reusingConnection: if the value is set to "true", the connection coming from previous components is reused. If set to "false", a new browser is launched, importing information from the previous session.
  o setBackPages(pages): this function determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
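The order in which syncWithPost, setBackSequence and setBackPages are considered when returning to a previous page can be summarized as a small decision function. This is a sketch of the documented order only: chooseBackStrategy, the settings object and the returned labels are hypothetical, and the default of one back page is an assumption.

```javascript
// Sketch of the decision order described for page state recovery.
function chooseBackStrategy(settings) {
    if (settings.syncWithPost) {
        // Re-send the POST parameters originally used to reach the page.
        return "POST";
    }
    if (settings.backSequence) {
        // Run the explicit NSEQL back program set with setBackSequence.
        return "BACK_SEQUENCE";
    }
    // Otherwise fall back to the NSEQL Back() command, repeated as many
    // times as configured with setBackPages (assumed default: 1).
    return "NSEQL_BACK x" + (settings.backPages || 1);
}

var withPost = chooseBackStrategy({ syncWithPost: true });
var withSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "..." });
var withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

Because POST synchronization is the default, the back sequence and the back page count only come into play once syncWithPost has been set to "false".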


5.3.13 Filter

• Object: Filter

• Description: this carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

o Constructor(expr, auxiliaryRecords)

• expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.

• auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subset that satisfies the selection expression indicated in the constructor.

• inputRecords: list of input records.

• auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
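The effect of ON_ERROR_IGNORE described in the note can be illustrated with a small stand-alone sketch. It does not use the ITPilot Filter object: filterRecords, the cheaperThanLimit callback, the sample records and the string values given to the two error constants are all hypothetical, and the sketch assumes the filter condition can be modelled as a plain JavaScript function.

```javascript
// Hypothetical stand-in for the Filter component; not the ITPilot API.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";   // assumed values, for illustration only
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Returns the records for which predicate(record, auxiliaryRecords) is true.
// With ON_ERROR_IGNORE, a record whose evaluation throws is skipped;
// with any other error action, the error is propagated to the caller.
function filterRecords(inputRecords, auxiliaryRecords, predicate, errorAction) {
    var result = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i], auxiliaryRecords)) {
                result.push(inputRecords[i]);
            }
        } catch (e) {
            if (errorAction !== ON_ERROR_IGNORE) {
                throw e;
            }
        }
    }
    return result;
}

var records = [
    { TITLE: "A", PRICE: 10 },
    { TITLE: "B" },               // PRICE is missing: evaluating the condition fails
    { TITLE: "C", PRICE: 30 }
];
var limits = [{ MAX_PRICE: 20 }]; // auxiliary records: used by the condition, never filtered

function cheaperThanLimit(record, aux) {
    if (typeof record.PRICE !== "number") {
        throw new Error("PRICE is not a number in record " + record.TITLE);
    }
    return record.PRICE <= aux[0].MAX_PRICE;
}

var kept = filterRecords(records, limits, cheaperThanLimit, ON_ERROR_IGNORE);
```

In this sketch only record A is kept: record C fails the condition, and record B is dropped because its evaluation raised an error that was ignored. With ON_ERROR_RAISE the same call would stop at record B.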

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: this generates an execution loop over a specific form, where each run uses predetermined values for each of the fields involved.

• Functions:

o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

• findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

• submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

• baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.

• inputPage: input page from which the selected form can be iteratively invoked.

• parallelIterator: if "true", the component executes its iterations in parallel.

o selectMultiplePositions(field, position, positionsArray, clickedArray): this indicates which positions are selected in a multiple selection field of the target form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting at position 0.

• positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

• clickedArray: list that indicates whether each valuesArray element is to be marked, not marked or both. The following JavaScript constants are defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): this indicates the values selected in a multiple selection field of the chosen form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting at position 0.

• valuesArray: list of values that must be selected in the field.

• positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

• equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).

• clickedArray: list that indicates whether each valuesArray element is to be marked, not marked or both. The following JavaScript constants are defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectPositions(field, position, positions): this indicates the values selected in a selection field of the chosen form.

• field: name of the HTML selection field.

• position: position of the field in the event of more than one field with the same name.

• positions: values of the elements on which the component must iterate.

o selectTexts(field, position, values, positions, equal): this indicates the values to be used in the different iterations on a text field.

• field: name of the HTML text field.

• position: position of the field in the event of several fields on the form with the same name.

• values: list of values that must be selected in the field.

• positions: list that indicates the position held for each element of values in the event of replicated values.

• equal: boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.

o click(field, value, state): function that selects an element and runs a "click" event on it.

• field: name of the HTML field on which the click is to be made.

• value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.

• state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o input(field, position, values): function that indicates the values entered in an input field.

• field: name of the HTML input field.

• position: position of the field in the event of several fields on the form with the same name.

• values: list of values that must be entered in the field.

o textarea(field, position, values): this indicates the values entered in a text area.

• field: name of the HTML text area field.

• position: position of the field in the event of several fields on the form with the same name.

• values: list of values that must be entered in the field.

o toList(): returns the list of NSEQL sequences used in each iteration.

o setMaxIterations(count): sets the maximum number of iterations that can be executed.

• count: maximum number of iterations.

o setRetries(count): sets the number of retries in the event of failures.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.

o setParallelIterator(flag): indicates whether the component launches its iterations in parallel.

• flag: if "true", the iterations are executed in parallel.

o next(inputPage): this returns the page resulting from running one iteration of the component.

• inputPage: optional parameter that indicates a new starting page on which the next iteration of the component is run.

o hasNext(): function that determines whether there are more results. It returns "true" if there is at least one more result, or "false" if there is not.

o close(): function that closes the iterator.

o syncWithPost(flag): this function indicates whether, in order to recover the status of the page, a POST message must be issued to the page URL with the POST parameters with which it was reached. This is the default synchronization method.

• flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): this function optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.

• reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.

o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.

o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation.

• 1: Internet Explorer browser implementation.

• 2: Firefox browser implementation.

• 3: Denodo HTTP browser implementation.
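The way value lists and CLICKED_AND_NON_CLICKED_ELEMENT expand into individual runs can be pictured with the following stand-alone sketch. It is only a model under stated assumptions: FormPlan is a hypothetical object, the constant values are invented, and the sketch assumes that the runs are the Cartesian product of the options of every field, which this guide does not state explicitly. It also shows how hasNext(), next() and a maximum number of iterations interact.

```javascript
// Hypothetical model of how a form iterator could enumerate its runs; not the ITPilot API.
var CLICKED_ELEMENT = "CLICKED";                   // assumed values, for illustration only
var NON_CLICKED_ELEMENT = "NON_CLICKED";
var CLICKED_AND_NON_CLICKED_ELEMENT = "BOTH";

function FormPlan() {
    this.fields = [];               // one entry per field: { name, options }
    this.maxIterations = Infinity;
    this.cursor = 0;
}

// Text input: one option per value.
FormPlan.prototype.input = function (field, values) {
    this.fields.push({ name: field, options: values.slice() });
};

// Checkbox: CLICKED_AND_NON_CLICKED_ELEMENT expands into two options.
FormPlan.prototype.click = function (field, state) {
    var options = state === CLICKED_AND_NON_CLICKED_ELEMENT
        ? [CLICKED_ELEMENT, NON_CLICKED_ELEMENT]
        : [state];
    this.fields.push({ name: field, options: options });
};

FormPlan.prototype.setMaxIterations = function (count) {
    this.maxIterations = count;
};

// Total number of combinations: product of the option counts.
FormPlan.prototype.total = function () {
    var n = 1;
    for (var i = 0; i < this.fields.length; i++) {
        n *= this.fields[i].options.length;
    }
    return n;
};

FormPlan.prototype.hasNext = function () {
    return this.cursor < Math.min(this.total(), this.maxIterations);
};

// Decodes the cursor as a mixed-radix number, one digit per field.
FormPlan.prototype.next = function () {
    var index = this.cursor++;
    var combination = {};
    for (var i = this.fields.length - 1; i >= 0; i--) {
        var f = this.fields[i];
        combination[f.name] = f.options[index % f.options.length];
        index = Math.floor(index / f.options.length);
    }
    return combination;
};

var plan = new FormPlan();
plan.input("city", ["Madrid", "Lisbon"]);
plan.click("onlyOffers", CLICKED_AND_NON_CLICKED_ELEMENT);

var runs = [];
while (plan.hasNext()) {
    runs.push(plan.next());
}
```

Two values for one field and a marked/unmarked pair for another give four runs in this model. Capping the plan with setMaxIterations stops the loop early without changing the order of the combinations.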

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.

• Functions:

o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.

• browserUuid: browser id.

o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).

• pageType: type of browser used to access the page:

• SEQUENCE_IEBROWSER = 1

• SEQUENCE_HTTP_BROWSER = 2

• lastURL: last URL the page comes from.

• lastURLMethod: access method (GET, POST) of the URL the page comes from.

• lastURLPostParameters: POST-method parameters of the URL the page comes from.

• cookie: cookie information.

• proxyUser: user name to access the proxy, if required.

• proxyPassword: user password to access the proxy, if required.

• proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: this is responsible for storing the structure of the input data, that is, the data that the wrapper receives from the calling application.

• Functions:

o Constructor(input, output)

• input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

• output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).

o get(name): this returns the value of a field of the record created as the group of initialization parameters.

• name: name of the record field.

o setText(field, obl, fixedValue): this creates a text-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether queries must provide a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setInt(field, obl, fixedValue): this creates an integer-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether queries must provide a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setLong(field, obl, fixedValue): this creates a long-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether queries must provide a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setFloat(field, obl, fixedValue): this creates a float-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether queries must provide a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setDouble(field, obl, fixedValue): this creates a double-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether queries must provide a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setBlob(field, obl, fixedValue): this creates a BLOB-type (binary large object) field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether queries must provide a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setBoolean(field, obl, fixedValue): this creates a Boolean-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether queries must provide a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setLink(field, obl, fixedValue): this creates a URL-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether queries must provide a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setDate(field, format, obl, fixedValue): this creates a date-type field in the initialization record.

• field: name of the field to create.

• format: representation format of the date field. This format is optional, but becomes compulsory once specified; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].

• obl: parameter that indicates whether queries must provide a value for the field to create. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setName(name): sets the component name.

• name: new component name.

o setI18n(i18n): function that sets the internationalization (i18n) configuration of the process.

• i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

o exec(): main function for running the component; it returns a record representing the wrapper initialization parameters.
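The difference between OPTIONAL, OBLIGATORY and FIXED parameters can be sketched with a small stand-alone model. InitSketch is a hypothetical object, not the ITPilot Init component, and the constant values are invented. The sketch also assumes that a FIXED field ignores any value supplied by the query and that a missing OPTIONAL field is left as null; neither point is stated in this guide.

```javascript
// Hypothetical model of the initialization record; not the ITPilot Init object.
var OPTIONAL = "OPTIONAL";       // assumed values, for illustration only
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

function InitSketch() {
    this.fields = [];
}

// Declares a text field; obl defaults to OPTIONAL, as in the description above.
InitSketch.prototype.setText = function (field, obl, fixedValue) {
    this.fields.push({ name: field, obl: obl || OPTIONAL, fixedValue: fixedValue });
};

// Builds the initialization record from the parameters of one query.
InitSketch.prototype.exec = function (queryParams) {
    var record = {};
    for (var i = 0; i < this.fields.length; i++) {
        var f = this.fields[i];
        var supplied = Object.prototype.hasOwnProperty.call(queryParams, f.name);
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue;       // assumption: the query cannot override it
        } else if (supplied) {
            record[f.name] = queryParams[f.name];
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null;               // assumption: optional and not supplied
        }
    }
    return record;
};

var init = new InitSketch();
init.setText("KEYWORD", OBLIGATORY);
init.setText("LANGUAGE");                        // defaults to OPTIONAL
init.setText("SITE", FIXED, "example.com");

var params = init.exec({ KEYWORD: "denodo", SITE: "ignored" });
```

In this model a query that omits KEYWORD is rejected, a query that omits LANGUAGE is accepted, and SITE always carries its constant value.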

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records, one by one.

• Functions:

o Constructor(list)

• list: list of records on which to iterate.

o hasNext(): this determines whether there are more results on which to iterate. It returns "true" if there is at least one more result.

o next(): this returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
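The pattern in Figure 5 can be tried outside ITPilot with the stand-ins below. ListIterator and TaskGroup are hypothetical objects written for this sketch: they mimic the hasNext()/next() protocol and the execute()/wait() calls of the Thread object, but TaskGroup runs the queued calls one after another, so there is no real concurrency here.

```javascript
// Hypothetical stand-ins for illustration; not the ITPilot Iterator or Thread objects.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
ListIterator.prototype.next = function () {
    if (!this.hasNext()) {
        throw new Error("No more records");
    }
    return this.list[this.position++];
};

// Collects the calls made through execute() and runs them when wait() is called.
// The calls run sequentially in this sketch.
function TaskGroup() {
    this.pending = [];
}
TaskGroup.prototype.execute = function (fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    this.pending.push(function () { fn.apply(null, args); });
};
TaskGroup.prototype.wait = function () {
    while (this.pending.length > 0) {
        this.pending.shift()();
    }
};

// Per-record processing function, playing the role of _functionIterator_1.
var titles = [];
function processRecord(structure, record) {
    structure.push(record.TITLE.toUpperCase());
}

var iterator = new ListIterator([{ TITLE: "first" }, { TITLE: "second" }]);
var group = new TaskGroup();
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    group.execute(processRecord, titles, recordInstance);
}
group.wait();   // titles is complete only after this call returns
```

The point of the sketch is the ordering constraint: results produced by the dispatched function should only be read after the wait() call has returned.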

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: this sends a query to any source available via JDBC and returns a record list with the results obtained.

• Functions:

o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

• uuid: unique identifier of the component.

• uri: connection URL to the database.

• driver: driver class used to connect to the data source.

• userName: user name.

• password: user password.

• structure: structure of the component's output record list. It is defined as a record of values.

• baseRecords: record list to be used.

• maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.

• initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.

• checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

• query: SQL query that returns the results required by the component.

o exec(query, baseRecords): executes the JDBCExtractor component.

• query: SQL query that returns the results required by the component.

• baseRecords: record list to be used.

o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

• maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.

• initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.

• pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

o disablePool(): disables the connection pool.

o addDriverProperty(propname, propvalue): adds a JDBC driver property.

• propname: property name.

• propvalue: property value.

5.3.19 Loop

• Description: this allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE ... DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: this iterates over several inter-related pages, using one or several browsing sequences.

• Functions:

o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

• sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.

• iterations: this indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information.

• inputPage: this indicates the page from which the next browsing sequence is to be run.

o next(inputRecords, inputPage): this returns the next iteration element.

• inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.

• inputPage: this indicates the page from which the next pages are to be accessed.

o close(): this closes the iterator.

o setRetries(count): this configures the number of retries in the event of an error when accessing the next page.

• count: number of retries.

o setRetryDelay(count): this configures the interval between two retries.

• count: interval in milliseconds.

o syncWithPost(flag): this function indicates whether, in order to recover the status of the page, a POST message must be issued to the page URL with the POST parameters with which it was reached. This is the default synchronization method.

• flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): this function optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.

• reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.

o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.

o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation.

• 1: Internet Explorer browser implementation.

• 2: Firefox browser implementation.

• 3: HTTP browser implementation.

5.3.21 Output

• Object: Output

• Description: this places a record in the wrapper output.

• Functions:

o Constructor(structure)

• structure: parameter that indicates the component input record to be used as the wrapper result.

o add(record): this adds, after construction, a component input record to be used as the wrapper result.

• record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: this constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.

• Functions:

o Constructor(recordsObj, name)

• recordsObj: list of input elements. Each element of the list can be a record or a list of records.

• name: name of the output record of the Record Constructor component.

o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.

• fieldName: name of the field.

• expression: expression that defines the field; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

• errorAction: action to be taken when the expression cannot be evaluated correctly. The possible values are:

• ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

• ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

o exec(): this runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
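How a field expression that refers to an input record might be resolved, and how the two error actions differ, can be sketched as follows. This is not the ITPilot expression engine: resolveReference and buildRecord are hypothetical helpers, the constant values are invented, and the sketch assumes that references are written as $<index>.<FIELD> and that a field whose expression fails under ON_ERROR_IGNORE is simply left out of the result.

```javascript
// Hypothetical resolver for references of the form $<index>.<FIELD>; not the ITPilot API.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";   // assumed values, for illustration only
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Looks up FIELD in the record at position <index> of the input list.
function resolveReference(expression, records) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = records[Number(match[1])];
    if (!record || !(match[2] in record)) {
        throw new Error("Cannot evaluate " + expression);
    }
    return record[match[2]];
}

// Builds one output record from a list of { fieldName, expression, errorAction }.
function buildRecord(records, definitions) {
    var output = {};
    for (var i = 0; i < definitions.length; i++) {
        var d = definitions[i];
        try {
            output[d.fieldName] = resolveReference(d.expression, records);
        } catch (e) {
            if (d.errorAction !== ON_ERROR_IGNORE) {
                throw e;
            }
            // ON_ERROR_IGNORE: assumption, the field is left out and construction continues.
        }
    }
    return output;
}

var inputRecords = [{ PARAM1: "x", PARAM2: "y" }, { TOTAL: 3 }];
var built = buildRecord(inputRecords, [
    { fieldName: "FIRST", expression: "$0.PARAM1", errorAction: ON_ERROR_RAISE },
    { fieldName: "MISSING", expression: "$1.UNKNOWN", errorAction: ON_ERROR_IGNORE },
    { fieldName: "TOTAL", expression: "$1.TOTAL", errorAction: ON_ERROR_RAISE }
]);
```

Here the second definition refers to a field that does not exist; because its error action is ON_ERROR_IGNORE, the other two fields are still produced. With ON_ERROR_RAISE the same definition would stop the construction.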

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: this creates a browsing sequence from the contents of a record. It allows sequences to be created to access other pages from the pages processed by the Extractor component.

• Functions:

o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

• sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.

• sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel.

• inputPage: optional; this indicates the starting page.

o exec(): this returns a page object that represents the target page of the browsing sequences.

o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.

• Functions:

o Constructor(page)

• page: page loaded in the browser that is going to be released.

o Constructor(browserUuid)

• browserUuid: browser identifier.

o exec(): executes the component.

5.3.25 Repeat

• Description: this allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT ... UNTIL). The Repeat component is implemented in JavaScript using a do ... while loop, with a Condition object used as the loop condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
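The practical difference between the while structure of the Loop component (section 5.3.19) and the do ... while structure of the Repeat component is whether the body can run zero times. The sketch below shows this with a hypothetical ConditionSketch object whose exec() simply evaluates a JavaScript callback; it is a stand-in for illustration, not the ITPilot Condition object.

```javascript
// Hypothetical stand-in for the Condition object: exec() evaluates a JavaScript callback.
function ConditionSketch(test) {
    this.test = test;
}
ConditionSketch.prototype.exec = function (records) {
    return Boolean(this.test(records));
};

var pending = 0;   // nothing left to process from the start
var morePending = new ConditionSketch(function () { return pending > 0; });

// Loop pattern: the condition is checked before the first pass.
var loopPasses = 0;
while (morePending.exec([])) {
    loopPasses++;
    pending--;
}

// Repeat pattern: the body runs once before the condition is checked.
var repeatPasses = 0;
do {
    repeatPasses++;
    pending--;
} while (morePending.exec([]));
```

With a condition that is false from the start, the while structure performs no pass and the do ... while structure performs exactly one.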

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When this component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component holds in the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: this creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

o Constructor(sequence, sequenceType, reusableConnection, inputPage)

• sequence: NSEQL browsing program (see [NSEQL]).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

• inputPage: optional parameter; this indicates the starting page. If it is not given, the NSEQL program is run directly.

o exec(inputValues, inputPage): this runs the Sequence component, returning the last page that the browsing sequence has reached.

• inputValues: list of values that can be used as input parameters within the browsing sequence.

• inputPage: optional parameter; this indicates the page from which the browsing sequence of the component is run.

o setRetries(count): sets the number of retries in the event of failures.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.

o close(): this closes the connection with the running browser.

o syncWithPost(flag): this function indicates whether, in order to recover the status of the page, a POST message must be issued to the page URL with the POST parameters with which it was reached. This is the default synchronization method.

• flag: "true" indicates that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): this function optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.

• reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.

o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.

• pages: number of back pages.

o toString(): this returns the NSEQL sequence (see [NSEQL]).

o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation.

• 1: Internet Explorer browser implementation.

• 2: Firefox browser implementation.

• 3: Denodo HTTP browser implementation.

5.3.28 Store File

• Object: StoreFile

• Description: stores the contents passed as the input parameter in a file.

• Functions:

o Constructor(content, file)

• content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content will be stored.

• file: path and name of the file where the contents are to be stored.

o exec(): runs the component.

o setGenerateFilename(generate): determines whether the output file name should be automatically generated when the input file is null or is a directory.

• generate: indicates whether the file name should be automatically generated.

o setRetries(count): sets the number of retries in the event of failures.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.
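
The interplay of setRetries and setRetryDelay can be pictured with a small stand-alone model. This is only a sketch: runWithRetries, action and sleep are hypothetical names, not part of the ITPilot API, and it assumes that the retry count means additional attempts after the first failure, which this guide does not state explicitly.

```javascript
// Illustrative model only: ITPilot's real retry logic is internal to the runtime.
// runWithRetries, action and sleep are hypothetical names, not ITPilot API.
function runWithRetries(action, retries, delayMs, sleep) {
    var lastError = null;
    // Assumption: one initial attempt plus up to `retries` further attempts.
    for (var attempt = 0; attempt <= retries; attempt++) {
        try {
            return action(attempt);
        } catch (e) {
            lastError = e;
            if (attempt < retries) {
                sleep(delayMs); // wait before the next retry
            }
        }
    }
    // Every attempt failed: surface the last error.
    throw lastError;
}

// Usage: an action that fails twice and then succeeds.
var waits = [];
var result = runWithRetries(
    function (attempt) {
        if (attempt < 2) { throw new Error("temporary failure"); }
        return "stored";
    },
    3,     // plays the role of setRetries(3)
    3000,  // plays the role of setRetryDelay(3000)
    function (ms) { waits.push(ms); } // stand-in that records the delay instead of sleeping
);
```

In this model the delay is applied only between attempts, never after the last one, so a component that keeps failing gives up as soon as the final retry has been spent.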

5.3.29 Thread

• Object: Thread

• Description: represents a thread in the ITPilot wrapper. It is typically used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

o wait(): blocks until all the executions launched with the execute function have finished.

o execute(functionName, <list of arguments>): launches a thread that runs the given function.

• functionName: name of the JavaScript function to be run.

• <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will run in parallel. Later requests are queued until the ongoing executions finish.

• int: maximum number of concurrent threads.
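
The queueing behavior described for setMaxConcurrentThreads can be illustrated with a deterministic simulation. The sketch below is not ITPilot code: simulateSchedule is a hypothetical helper, it assumes a first-in, first-out queue, and it assumes a limit of at least 1.

```javascript
// Illustrative model, not ITPilot code: simulateSchedule is a hypothetical helper.
// durations[i] is how long the i-th execute() call takes; the result holds the
// time at which each call would start under a FIFO queue (assumption).
function simulateSchedule(durations, maxConcurrent) {
    var slots = [];  // time at which each parallel slot becomes free
    for (var s = 0; s < maxConcurrent; s++) { slots.push(0); }
    var starts = [];
    for (var i = 0; i < durations.length; i++) {
        // Pick the slot that becomes free first.
        var best = 0;
        for (var k = 1; k < slots.length; k++) {
            if (slots[k] < slots[best]) { best = k; }
        }
        starts.push(slots[best]);      // the call starts when that slot is free
        slots[best] += durations[i];   // and occupies it for its whole duration
    }
    return starts;
}

// Four calls with a limit of two concurrent threads.
var starts = simulateSchedule([3, 1, 2, 1], 2);
```

With a limit of two, the third and fourth calls do not start until an earlier one has finished, which is the point at which a queued request would be picked up.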

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

o This function defines the component output type. The possible values are:

• LIST_TYPE = 1

• PAGE_TYPE = 2

• RECORD_TYPE = 3

• SIMPLE_TYPE = 4

• ARRAY_TYPE = 5

• BINARY_TYPE = 6

• BOOLEAN_TYPE = 7

• DATE_TYPE = 8

• DOUBLE_TYPE = 9

• FLOAT_TYPE = 10

• INT_TYPE = 11

• LONG_TYPE = 12

• STRING_TYPE = 13

• URL_TYPE = 14

• BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
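
As a rough illustration of the naming scheme, the following sketch outlines a hypothetical component called “upper” that returns its input in upper case. The component name and its behavior are invented for this example. STRING_TYPE is declared locally, with the value listed in this section, only so that the snippet is self-contained; the input structure function is left out because this guide does not show its body.

```javascript
// Local copy of the constant listed in this section, so the sketch runs on its own
// (assumption: the ITPilot runtime normally provides it).
var STRING_TYPE = 13;

// Main function of the hypothetical "upper" component: <component>_main(<input>).
function upper_main(upper_input) {
    var upper_output = null;
    if (upper_input !== null && upper_input !== undefined) {
        upper_output = String(upper_input).toUpperCase();
    }
    return upper_output;
}

// Declares that the component returns a string value.
function upper_getOutputType() {
    return STRING_TYPE;
}

// No upper_getOutputStructure is defined: the output type is neither
// RECORD_TYPE nor LIST_TYPE, so it is not needed.
```

Following the convention above, such a file would be named after the component (here, upper.js) so that its name matches the prefix of its functions.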

5.4.2 Using Custom Components

To use a custom component developed in JavaScript, store it in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}

Figure 8: Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> represents the name of the component.

• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is straightforward. Only the following VQL statement has to be written:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘\’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

5.3.7 Create Persistent Browser

• Object: Create_Persistent_Browser

• Description: creates a persistent browser, that is, a browser that is kept running and active after the execution of the wrapper that initiated it.

• Functions:

o Constructor(): creates a persistent browser and returns its handler.

o exec(): executes the component.

5.3.8 Diff

• Object: Diff

• Description: compares two pages and returns the differences between their retrieved HTML code.

• Functions:

o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)

• additionPrefixLabel: prefix to use when generating the result page for the new content (by default, a green background HTML tag).

• additionSuffixLabel: suffix to use when generating the result page for the new content (by default, a green background HTML end tag).

• deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, a red background HTML tag).

• deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, a red background HTML end tag).

• tokenSeparator: character string used to separate HTML page elements when the result page is generated, so that each of them can be clearly identified.

o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.

• baseCode: character string with the source page content.

• finalCode: character string or page object with the target page content.

o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages and points out the differences between them.

• baseCode: character string with the source page content.

• finalCode: character string or page object with the target page content.

o setAdditionPrefixLabel(additionPrefixLabel): modifies the starting tag for added data.

• additionPrefixLabel: prefix to use when generating the result page for the new content (by default, a green background HTML tag).

o setAdditionSuffixLabel(additionSuffixLabel): modifies the ending tag for added data.

• additionSuffixLabel: suffix to use when generating the result page for the new content (by default, a green background HTML end tag).

o setDeletionPrefixLabel(deletionPrefixLabel): modifies the starting tag for deleted data.

• deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, a red background HTML tag).

o setDeletionSuffixLabel(deletionSuffixLabel): modifies the ending tag for deleted data.

• deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, a red background HTML end tag).

o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return “null” instead of the page itself.

• nullWhenEquals: “true” implies that “null” will be returned when both pages are equal; “false” means that the result page will be returned.

o setIgnoreTagAttributes(simplifyTags): determines whether the component ignores the HTML tag attributes when comparing both pages.

• simplifyTags: “true” means that the HTML tag attributes will be ignored. With “false” they will not be ignored.

o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.

• toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.

o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page or not.

• mergedDeletions: if “true”, the deleted content will be shown. If the value is “false”, the configuration of the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.

o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to HTML tokens of the source pages before comparing them.

• replacement: Perl [PERL] regular expression.

o addIgnoredToken(regexp): adds a regular expression to a list. These regular expressions can be applied to HTML tokens of the page. Tokens that match the regular expression will be discarded before the comparison starts.

• regexp: Perl [PERL] regular expression.
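
To make the role of the four labels concrete, the sketch below marks added and deleted tokens in a whitespace-separated text using a longest-common-subsequence table. It is an illustration only and not the algorithm ITPilot uses: diffTokens and the labels object are hypothetical names, and real HTML tokenization, token replacements and ignored tokens are left out.

```javascript
// Illustrative token-level diff based on a longest-common-subsequence table.
// NOT ITPilot's algorithm; diffTokens and the labels object are hypothetical.
function diffTokens(baseCode, finalCode, labels) {
    var a = baseCode.split(/\s+/).filter(Boolean);
    var b = finalCode.split(/\s+/).filter(Boolean);
    var i, j;
    // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
    var lcs = [];
    for (i = a.length; i >= 0; i--) {
        lcs[i] = [];
        for (j = b.length; j >= 0; j--) {
            if (i === a.length || j === b.length) {
                lcs[i][j] = 0;
            } else if (a[i] === b[j]) {
                lcs[i][j] = lcs[i + 1][j + 1] + 1;
            } else {
                lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }
    }
    // Walk both token lists, wrapping deletions and additions in their labels.
    var out = [];
    i = 0;
    j = 0;
    while (i < a.length || j < b.length) {
        if (i < a.length && j < b.length && a[i] === b[j]) {
            out.push(a[i]); i++; j++;
        } else if (j === b.length || (i < a.length && lcs[i + 1][j] >= lcs[i][j + 1])) {
            out.push(labels.deletionPrefix + a[i] + labels.deletionSuffix); i++;
        } else {
            out.push(labels.additionPrefix + b[j] + labels.additionSuffix); j++;
        }
    }
    return out.join(" ");
}

// Plain-text markers stand in for the default green/red HTML tags.
var marked = diffTokens("a b c", "a x c", {
    additionPrefix: "[+", additionSuffix: "+]",
    deletionPrefix: "[-", deletionSuffix: "-]"
});
// marked is "a [-b-] [+x+] c"
```

When the two inputs are identical, the result contains no labels at all, which corresponds to the situation that setNullWhenEquals addresses.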

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: defines an expression (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.

• Functions:

o Constructor(expression)

• expression: object that defines the expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.

• exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.

5.3.11 Extractor

• Object: Extractor

• Description: extracts structured data from an HTML page by running a DEXTL program ([DEXTL]).

• Functions:

o Constructor(name, page, specification, structure)

• name: name of the Extractor component instance.

• page: page-type ITPilot structure from which data is to be extracted.

• specification: DEXTL data extraction specification (see [DEXTL]).

• structure: name of the record (previously created) that will be used to return the data extracted by the specification.

o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the structure parameter of the constructor.

o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).

• merge: Boolean parameter. “true” if the pattern merging technique is to be applied, “false” if not. It is “true” by default.

o setI18n(i18n): sets the internationalization configuration used in the process.

• i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

o Constructor(url, sequenceType, reusableConnection, binary, page)

• url: URL where the resource to be downloaded can be found (OPTIONAL).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

• binary: “true” if the object to be downloaded is binary; “false” if it is in text format.

• page: optionally, the page from which the HTTP request is launched.

o exec(page): runs the component, returning the string- or binary-type value obtained.

• page: optionally, the page from which the HTTP request is launched.

o setEncoding(encoding): sets the MIME type [MIME] of the information to send.

• encoding: MIME type of the information to send.

o syncWithPost(flag): sets the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.

• flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.

o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which further information extraction operations are going to be executed.

• back: back sequence NSEQL program.

o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

• reusingConnection: if set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser will be launched, importing information from the previous session.

o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.

o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation

• 1: Internet Explorer browser implementation

• 2: Firefox browser implementation

• 3: Denodo HTTP browser implementation

5.3.13 Filter

• Object: Filter

• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

o Constructor(expr, auxiliaryRecords)

• expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.

• auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.

• inputRecords: list of input records.

• auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements, except for the one that caused the error.

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: generates an execution loop over a specific form, using predetermined values for each of its fields in each run.

• Functions:

o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

• findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

• submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

• baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences used in this component.

• inputPage: input page from which the selected form can be iteratively invoked.

• parallelIterator: if “true”, the component will execute its iterations in parallel.

o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting at position 0.

• positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

• clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field for the chosen form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting at position 0.

• valuesArray: list of values that must be selected in the field.

• positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

• equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).

• clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectPositions(field, position, positions): indicates the values selected from a selection field for the chosen form.

• field: name of the HTML selection field.

• position: position of the field in the event of more than one field element with the same name.

• positions: values of the elements on which the component must iterate.

o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.

• field: name of the HTML text field.

• position: position of the field in the event of several fields on the form with the same name.

• values: list of values that must be selected in the field.

• positions: list that indicates the position held by each value element in the event of replicated values.

• equals: boolean value that indicates whether the field values must exactly match those provided to the function or may simply contain them.

o click(field, value, state): selects an element and runs a “click” event on it.

• field: name of the HTML field on which the click is to be made.

• value: when this function is run on Radio Buttons, this parameter indicates the elements selected, as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.

• state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o input(field, position, values): indicates the values added to an input field.

• field: name of the HTML input field.

• position: position of the field in the event of several fields on the form with the same name.

• values: list of values that must be selected in the field.

o textarea(field, position, values): indicates the values added to a text area.

• field: name of the HTML input field.

• position: position of the field in the event of several fields on the form with the same name.

• values: list of values that must be selected in the field.

o toList(): returns the list with the NSEQL sequences used in each iteration.

o setMaxIterations(count): sets the maximum number of iterations that can be executed.

• count: maximum number of iterations.

o setRetries(count): sets the number of retries in the event of failures.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.

o setParallelIterator(flag): determines whether the component launches its iterations in parallel.

• flag: if “true”, the iterations will be executed in parallel.

o next(inputPage): returns the page resulting from running a component iteration.

• inputPage: optional parameter that indicates a new starting page on which the new component iteration is run.

o hasNext(): determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.

o close(): closes the iterator.

o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.

• flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence function. If there is not, an NSEQL Back() command is run.

o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

• reusingConnection: if “true”, the connection from previous components will be reused. If “false”, a new browser is opened and the data from the previous session is imported into it.

o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither an explicit back sequence nor a POST navigation has been configured as back sequence.

o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation

• 1: Internet Explorer browser implementation

• 2: Firefox browser implementation

• 3: Denodo HTTP browser implementation
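
The effect of CLICKED_AND_NON_CLICKED_ELEMENT on the number of iterations can be sketched as follows. This is a model under assumptions, not ITPilot code: expandClickedStates is a hypothetical helper, the three constants are given placeholder string values because this guide does not list their real values, and the sketch assumes that the combinations of several elements multiply.

```javascript
// Placeholder stand-ins: the values ITPilot assigns to these constants
// are not given in this guide.
var CLICKED_ELEMENT = "clicked";
var NON_CLICKED_ELEMENT = "non-clicked";
var CLICKED_AND_NON_CLICKED_ELEMENT = "both";

// Expands a clickedArray into every concrete marked (true) / unmarked (false)
// combination. expandClickedStates is an illustrative helper, not an ITPilot function.
function expandClickedStates(clickedArray) {
    var combos = [[]];
    for (var i = 0; i < clickedArray.length; i++) {
        // "Both" contributes two options; the other constants contribute one.
        var options = clickedArray[i] === CLICKED_AND_NON_CLICKED_ELEMENT
            ? [true, false]
            : [clickedArray[i] === CLICKED_ELEMENT];
        var next = [];
        for (var c = 0; c < combos.length; c++) {
            for (var o = 0; o < options.length; o++) {
                next.push(combos[c].concat([options[o]]));
            }
        }
        combos = next;
    }
    return combos;
}

var combos = expandClickedStates([
    CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT,
    NON_CLICKED_ELEMENT
]);
// combos is [[true, true, false], [true, false, false]]
```

Under this assumption, each element set to CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of combinations, which is worth keeping in mind when choosing a value for setMaxIterations.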

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.

• Functions:

o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.

• browserUuid: browser id.

o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing with a Sequence object (see section 5.3.27).

• pageType: type of browser used to access the page:

• SEQUENCE_IEBROWSER = 1

• SEQUENCE_HTTP_BROWSER = 2

• lastURL: last URL from which the page comes.

• lastURLMethod: access method (GET, POST) of the URL from which the page comes.

• lastURLPostParameters: POST-method parameters of the URL from which the page comes.

• cookie: information storage “cookies”.

• proxyUser: user name to access the proxy, if required.

• proxyPassword: user password to access the proxy, if required.

• proxyDomain: proxy domain, if required.


5.3.16 Init

• Object: Init

• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.

• Functions:

  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (by the exec() function).

  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.

  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; no value needs to be assigned to it in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; no value needs to be assigned to it in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.


  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; no value needs to be assigned to it in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; no value needs to be assigned to it in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; no value needs to be assigned to it in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.


    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; no value needs to be assigned to it in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; no value needs to be assigned to it in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; no value needs to be assigned to it in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field, as defined in [DATEFORMAT]. The format is optional, but once it is specified it becomes compulsory: values must follow it, otherwise the wrapper may not run.
    • obl: indicates whether a value for the field is compulsory in queries on the wrapper. The possible values are:


      • OPTIONAL (default value): the parameter is optional; no value needs to be assigned to it in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): updates the component name.
    • name: new component name.

  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

  o exec(): main function for running the component; returns a record representing the wrapper initialization parameters.
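
As a rough illustration of how the Init setters might be combined, the sketch below declares three query parameters. Inside ITPilot, Init and the OPTIONAL, OBLIGATORY and FIXED constants are supplied by the runtime; the InitStandIn class, the constant values and the field names used here are hypothetical stand-ins, defined only so that the fragment runs on its own.

```javascript
// Hypothetical stand-ins: inside ITPilot, Init and these constants already exist.
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

function InitStandIn() {
    this.fields = {};
}
// Registers a field; mirrors the documented (field, obl, fixedValue) order.
InitStandIn.prototype.declare = function (type, field, obl, fixedValue) {
    this.fields[field] = {
        type: type,
        obl: obl === undefined ? OPTIONAL : obl,   // OPTIONAL is the documented default
        value: obl === FIXED ? fixedValue : null
    };
};
InitStandIn.prototype.setText = function (field, obl, fixedValue) {
    this.declare("text", field, obl, fixedValue);
};
InitStandIn.prototype.setInt = function (field, obl, fixedValue) {
    this.declare("int", field, obl, fixedValue);
};
InitStandIn.prototype.get = function (name) {
    return this.fields[name].value;
};
InitStandIn.prototype.exec = function () {
    return this.fields;
};

var init = new InitStandIn();
init.setText("TITLE", OBLIGATORY);       // must be given in every query
init.setInt("MAX_RESULTS");              // obl omitted, so it is optional
init.setText("LANGUAGE", FIXED, "en");   // constant value
var inputRecord = init.exec();
```

The stand-in only records the declarations; it does not reproduce how ITPilot validates or binds query values.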


5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records, one by one.

• Functions:

  o Constructor(list)
    • list: list of records on which to iterate.

  o hasNext(): determines whether there are more results on which to iterate. It returns “true” if there is at least one more result.

  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
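
The sequential (non-parallel) form of the same loop can be sketched as follows. ITPilot supplies the real Iterator; ListIterator below is a hypothetical stand-in backed by a plain array so that the fragment runs on its own, and the record fields are invented for the example.

```javascript
// Hypothetical stand-in for ITPilot's Iterator, backed by a plain array.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
ListIterator.prototype.next = function () {
    return this.list[this.position++];
};

var records = [{ TITLE: "first" }, { TITLE: "second" }, { TITLE: "third" }];
var iterator = new ListIterator(records);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();   // records arrive in list order
    titles.push(recordInstance.TITLE);
}
```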


5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.

• Functions:

  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class to use to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.

  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.

  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.


  o disablePool(): disables the connection pool.

  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.


5.3.19 Loop

• Description: allows loops to be built in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6 Using the Loop function
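
The WHILE… DO behaviour can be tried outside ITPilot with a small sketch. The real Condition object evaluates an ITPilot expression; LoopCondition below is a hypothetical stand-in that wraps an ordinary JavaScript predicate, and the page counter is invented for the example.

```javascript
// Hypothetical stand-in for Condition: wraps a predicate; exec() evaluates it.
function LoopCondition(predicate) {
    this.predicate = predicate;
}
LoopCondition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

var page = 0;
var visited = [];
var loop = new LoopCondition(function () { return page < 3; });
// WHILE ... DO: the condition is checked before each pass,
// so the body may run zero times.
while (loop.exec([])) {
    visited.push(page);
    page++;
}
```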


5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: allows iterating over different inter-related pages by means of one or several browsing sequences.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched that keeps the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.

  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.

  o close(): closes the iterator.

  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.

  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.


  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence for returning to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no post navigation has been configured as back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.
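
The rule stated for the sequences parameter (a single sequence serves every iteration, several sequences are consumed one per iteration) can be sketched as a small helper. The function name, the placeholder sequence strings and the behaviour once the list is exhausted are assumptions made for illustration; they are not part of the ITPilot API.

```javascript
// Hypothetical helper mirroring the documented rule for the sequences list.
function sequenceForIteration(sequences, iteration) {
    if (sequences.length === 1) {
        return sequences[0];          // a single sequence serves every iteration
    }
    if (iteration < sequences.length) {
        return sequences[iteration];  // otherwise one sequence per iteration
    }
    return null;                      // assumption: nothing left to run
}

var single = ["<NSEQL next-page sequence>"];
var several = ["<NSEQL sequence A>", "<NSEQL sequence B>"];
var reused = sequenceForIteration(single, 5);
var second = sequenceForIteration(several, 1);
var exhausted = sequenceForIteration(several, 2);
```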


5.3.21 Output

• Object: Output

• Description: places a record in the wrapper output.

• Functions:

  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.

  o add(record): adds, at a later point, a record to be used as the wrapper result.
    • record: record to use.


5.3.22 Record Constructor

• Object: Record_Constructor

• Description: builds a record from other records generated in the flow, and can generate new attributes derived from existing ones.

• Functions:

  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements, except for the one that caused the error.


5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched that keeps the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator runs in parallel with this component.
    • inputPage: optional; allows a starting page to be indicated.

  o exec(): returns a page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.


5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as browser identifier and releases that specific browser.

• Functions:

  o Constructor(page)
    • page: page loaded on the browser that is going to be released.

  o Constructor(browserUuid)
    • browserUuid: browser identifier.

  o exec(): executes the component.


5.3.25 Repeat

• Description: allows loops to be built in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
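
The practical difference from the Loop component is that the body always runs at least once, because the condition is only evaluated after each pass. A minimal sketch, in which RepeatCondition is a hypothetical stand-in for the ITPilot Condition object that wraps an ordinary JavaScript predicate:

```javascript
// Hypothetical stand-in for Condition: wraps a predicate; exec() evaluates it.
function RepeatCondition(predicate) {
    this.predicate = predicate;
}
RepeatCondition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

var attempts = 0;
// The predicate is false from the start, yet the body still runs once.
var repeat = new RepeatCondition(function () { return false; });
do {
    attempts++;
} while (repeat.exec([]));
```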


5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component occupies in the process flow.


5.3.27 Sequence

• Object: Sequence

• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter; indicates the starting page. If it is not specified, the NSEQL program is run directly.

  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; the page from which the component’s browsing sequence is run.

  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

  o close(): closes the connection with the running browser.

  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.


  o setBackSequence(back): optionally specifies an explicit browsing sequence for returning to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no post navigation has been configured as back sequence.
    • pages: number of back pages.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
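
The guide does not describe how ITPilot applies the values given to setRetries and setRetryDelay. One plausible reading, sketched here under that assumption, is that the count is the number of additional attempts after the first failure and that the delay is observed between consecutive attempts. The runWithRetries helper and the simulated failing action are hypothetical; the pause is recorded rather than actually slept so that the fragment runs instantly.

```javascript
// Runs `action` once plus up to `retries` further attempts, calling
// `pause(delayMs)` between attempts. All names here are hypothetical.
function runWithRetries(action, retries, delayMs, pause) {
    var lastError = null;
    for (var attempt = 0; attempt <= retries; attempt++) {
        try {
            return action(attempt);
        } catch (e) {
            lastError = e;
            if (attempt < retries) {
                pause(delayMs);
            }
        }
    }
    throw lastError;
}

var pauses = [];
var result = runWithRetries(
    function (attempt) {
        // Simulated navigation that fails on the first two attempts.
        if (attempt < 2) { throw new Error("page not loaded"); }
        return "page reached on attempt " + attempt;
    },
    3,      // as in setRetries(3)
    500,    // as in setRetryDelay(500)
    function (ms) { pauses.push(ms); }   // records the wait instead of sleeping
);
```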


5.3.28 Store File

• Object: StoreFile

• Description: stores in a file the contents received as input parameter.

• Functions:

  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.


5.3.29 Thread

• Object: Thread

• Description: represents a thread in the ITPilot wrapper. It is typically used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): blocks until all executions launched with the execute function have finished.

  o execute(functionName, <list of arguments>): launches the execution of the given function in a thread.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
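
The calling pattern of execute and wait can be illustrated with a stand-in that runs outside ITPilot. ThreadStandIn below is hypothetical and single-threaded: it queues each call and runs the queue when wait() is invoked, so it shows only the order of calls, not real concurrency. It also receives the function itself, whereas the ITPilot execute function is documented as taking the function name.

```javascript
// Hypothetical, single-threaded stand-in for ITPilot's Thread:
// execute() queues a call and wait() runs everything queued.
function ThreadStandIn() {
    this.pending = [];
}
ThreadStandIn.prototype.execute = function (fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    this.pending.push(function () { fn.apply(null, args); });
};
ThreadStandIn.prototype.wait = function () {
    while (this.pending.length > 0) {
        this.pending.shift()();
    }
};

var results = [];
function processRecord(prefix, record) {
    results.push(prefix + record.ID);
}

var thread = new ThreadStandIn();
var records = [{ ID: 1 }, { ID: 2 }, { ID: 3 }];
for (var i = 0; i < records.length; i++) {
    // The extra arguments must match the parameters of processRecord.
    thread.execute(processRecord, "item-", records[i]);
}
thread.wait();   // returns only once every queued call has finished
```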


5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components, and define the following functions in it:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }


  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
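
A minimal sketch of such a file, for a hypothetical component named “upper” that returns its input in upper case. The component name and its behaviour are invented for illustration. STRING_TYPE is declared locally with the value listed in this section so that the fragment runs on its own; inside ITPilot the constant is presumably already defined. The upper_getInputStructure function is left out because this guide does not show what its body should contain.

```javascript
// Hypothetical component named "upper". STRING_TYPE mirrors the documented
// value 13; inside ITPilot the constant is presumably predefined.
var STRING_TYPE = 13;

// Main function: receives the input element and returns the output.
function upper_main(upper_input) {
    var upper_output = null;
    if (upper_input !== null && upper_input !== undefined) {
        upper_output = String(upper_input).toUpperCase();
    }
    return upper_output;
}

// Output type; no upper_getOutputStructure is needed because the type
// is neither RECORD_TYPE nor LIST_TYPE.
function upper_getOutputType() {
    return STRING_TYPE;
}
```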

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as described in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following piece of code:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been developed.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘’ character.


REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

  • DENODO ITPILOT 46 DEVELOPER GUIDE
  • INDEX
  • FIGURES
  • PREFACE
  • 1 INTRODUCTION
  • 2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
    • 21 WEB SERVICE TYPES
    • 22 INVOKING SOAP WEB SERVICES
    • 23 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
      • 231 HTML Output Configuration
        • 24 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
          • 3 ITPILOT DEVELOPMENT API
            • 31 CONNECTING TO THE SERVER
            • 32 OBTAINING WRAPPERS
            • 33 USING WRAPPERS
            • 34 PROCESSING QUERY RESULTS
              • 341 Canceling Queries
                • 35 EXAMPLE OF USE
                  • 4 CREATING CUSTOM ITPILOT FUNCTIONS
                    • 41 NAMING CONVENTIONS AND ANNOTATIONS
                    • 42 COMPOUND TYPES
5.3.8 Diff

• Object: Diff
• Description: the Diff component compares two pages and returns the differences between them in terms of the retrieved HTML code.
• Functions:
  o Constructor(additionPrefixLabel, additionSuffixLabel, deletionPrefixLabel, deletionSuffixLabel, tokenSeparator)
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
    • tokenSeparator: character string used as the HTML page element separator when the result page is generated, so that each element can be adequately identified.
  o diff(baseCode, finalCode): returns “true” if both pages are identical, “false” if they are different.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o exec(baseCode, finalCode): executes the Diff component, returning a character string that represents the HTML content of those pages with the differences between them marked.
    • baseCode: character string with the source page content.
    • finalCode: character string or page object with the target page content.
  o setAdditionPrefixLabel(additionPrefixLabel): modifies the start tag for added content.
    • additionPrefixLabel: prefix to use when generating the result page for the new content (by default, green background HTML tag).
  o setAdditionSuffixLabel(additionSuffixLabel): modifies the end tag for added content.
    • additionSuffixLabel: suffix to use when generating the result page for the new content (by default, green background HTML end tag).
  o setDeletionPrefixLabel(deletionPrefixLabel): modifies the start tag for deleted content.
    • deletionPrefixLabel: prefix to use when generating the result page for the deleted content (by default, red background HTML tag).
  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the end tag for deleted content.
    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, red background HTML end tag).
  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component will return “null” instead of the page itself.
    • nullWhenEquals: “true” means that “null” will be returned when both pages are equal; “false” means that the result page will be returned.
  o setIgnoreTagAttributes(simplifyTags): determines whether the component ignores the HTML tag attributes when comparing both pages.
    • simplifyTags: “true” means that the HTML tag attributes will be ignored. With “false”, they will not be ignored.
  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.
    • toLowerCase: “true” transforms all HTML content to lower case; “false” keeps the content as is.
  o setShowRemovedContent(mergedDeletions): determines whether the deleted content is shown in the result page.
    • mergedDeletions: with “true”, the deleted content will be shown. If the value is “false”, the configuration set with the functions setDeletionPrefixLabel and setDeletionSuffixLabel will not be taken into account.
  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before comparing them.
    • replacement: Perl [PERL] regular expression.
  o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. These regular expressions can be applied to the HTML tokens of the page; the tokens that match a regular expression will be discarded before starting the comparison.
    • regexp: Perl [PERL] regular expression.
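
The role of the prefix and suffix labels and of the token separator can be illustrated with a small stand-alone sketch. It is not the ITPilot Diff component: the function name markDifferences, the labels object and the bracket markers are assumptions made for this illustration, and the comparison uses a plain longest-common-subsequence walk over already tokenized input.

```javascript
// Hypothetical stand-in, NOT the ITPilot Diff component: it only illustrates
// how prefix/suffix labels can wrap added and deleted tokens.
function markDifferences(baseTokens, finalTokens, labels) {
  var n = baseTokens.length;
  var m = finalTokens.length;
  var lcs = [];
  var i, j;

  // lcs[i][j] = length of the longest common subsequence of
  // baseTokens[i..] and finalTokens[j..].
  for (i = 0; i <= n; i++) {
    lcs.push([]);
    for (j = 0; j <= m; j++) {
      lcs[i].push(0);
    }
  }
  for (i = n - 1; i >= 0; i--) {
    for (j = m - 1; j >= 0; j--) {
      if (baseTokens[i] === finalTokens[j]) {
        lcs[i][j] = lcs[i + 1][j + 1] + 1;
      } else {
        lcs[i][j] = Math.max(lcs[i + 1][j], lcs[i][j + 1]);
      }
    }
  }

  // Walk both token lists, labelling whatever is not shared.
  var out = [];
  i = 0;
  j = 0;
  while (i < n || j < m) {
    if (i < n && j < m && baseTokens[i] === finalTokens[j]) {
      out.push(baseTokens[i]);
      i++;
      j++;
    } else if (j >= m || (i < n && lcs[i + 1][j] >= lcs[i][j + 1])) {
      out.push(labels.deletionPrefix + baseTokens[i] + labels.deletionSuffix);
      i++;
    } else {
      out.push(labels.additionPrefix + finalTokens[j] + labels.additionSuffix);
      j++;
    }
  }
  return out.join(labels.tokenSeparator);
}

// Illustrative labels; the real defaults are HTML tags with coloured backgrounds.
var labels = {
  additionPrefix: "[+", additionSuffix: "+]",
  deletionPrefix: "[-", deletionSuffix: "-]",
  tokenSeparator: " "
};
var marked = markDifferences(["a", "b", "c"], ["a", "x", "c"], labels);
```

In this sketch, tokens shared by both pages are emitted unchanged, so two identical inputs produce output with no labels at all.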

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE("sequence", "ExecuteJS(<JavaScript code here>)", SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4: Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that will be evaluated to produce an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): method that runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
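
The positional notation ($0, $1, ...) used in the expression string can be pictured with the following stand-alone sketch. The helper bindPlaceholders is hypothetical and is not part of ITPilot; it only shows how each placeholder refers to the element at that position in the list passed to exec, and it does not evaluate the resulting expression.

```javascript
// Hypothetical helper, not part of ITPilot: shows how positional
// placeholders ($0, $1, ...) map onto the list passed to exec.
function bindPlaceholders(expression, inputs) {
  return expression.replace(/\$(\d+)/g, function (match, index) {
    var position = parseInt(index, 10);
    if (position >= inputs.length) {
      throw new Error("No input value for placeholder " + match);
    }
    return String(inputs[position]);
  });
}

// $0 is bound to the first element, $1 to the second.
var bound = bindPlaceholders("($0 <= $1)", [3, 5]);
```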

5.3.11 Extractor

• Object: Extractor
• Description: this is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the record (previously created) that will be used to return the data extracted by the specification.
  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined in the constructor in the structure parameter.
  o setMergePatterns(merge): applies the pattern merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter. “true” if the pattern merging technique is to be applied, or “false” if not. This is “true” by default.
  o setI18n(i18n): function that updates the process internationalization.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): lets the user determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): lets the user set the method for recovering the page state. ITPilot will send a POST message to the page URL with the POST parameters that were used to initially access that page. This is the default synchronization method.
    • flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): lets the user optionally set an explicit browsing sequence back to the source page, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.
    • reusingConnection: if the value is set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser will be launched, importing information from the previous session.
  o setBackPages(pages): determines the number of pages ITPilot can go back when a Back() NSEQL command is executed, that is, when neither a back sequence nor a POST navigation has been defined.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
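
The order in which the synchronization options are considered (syncWithPost, then setBackSequence, then Back() with setBackPages) can be summarized in a short stand-alone sketch. The function chooseReturnStrategy and its configuration object are hypothetical and do not belong to the ITPilot API; the default of one page for backPages is also an assumption of this sketch.

```javascript
// Hypothetical helper, not part of the ITPilot API: mirrors the decision
// order described for syncWithPost, setBackSequence and setBackPages.
function chooseReturnStrategy(config) {
  if (config.syncWithPost) {
    // Re-send the POST request that originally led to the page.
    return { kind: "POST" };
  }
  if (config.backSequence) {
    // An explicit NSEQL back sequence takes precedence over Back().
    return { kind: "BACK_SEQUENCE", sequence: config.backSequence };
  }
  // Fall back to the Back() command; one page is assumed when unset.
  return { kind: "BACK_COMMAND", pages: config.backPages || 1 };
}

var byPost = chooseReturnStrategy({ syncWithPost: true });
var byCommand = chooseReturnStrategy({ syncWithPost: false, backPages: 2 });
```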

5.3.13 Filter

• Object: Filter
• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
  o Constructor(expr, auxiliaryRecords)
    • expr: regular expression of the filtering operation for a list of records, which are described in the exec function.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.
  o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subgroup complying with the selection expression indicated in the constructor.
    • inputRecords: list of input records.
    • auxiliaryRecords: list of records that participate in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements, except for the one that caused the error.
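
The effect described in the note can be sketched outside ITPilot as follows. The function filterRecords is hypothetical, and the two constants are redefined here as plain strings only so that the example is self-contained; they stand in for the ITPilot constants of the same name.

```javascript
// Hypothetical stand-in for the behaviour described in the note above.
// The constants are local placeholders, not the ITPilot definitions.
var ON_ERROR_RAISE = "raise";
var ON_ERROR_IGNORE = "ignore";

function filterRecords(inputRecords, predicate, onError) {
  var result = [];
  for (var i = 0; i < inputRecords.length; i++) {
    try {
      if (predicate(inputRecords[i])) {
        result.push(inputRecords[i]);
      }
    } catch (e) {
      if (onError !== ON_ERROR_IGNORE) {
        throw e;
      }
      // ON_ERROR_IGNORE: drop only the record that failed and carry on.
    }
  }
  return result;
}

// The second record makes the predicate fail.
var records = [{ price: 10 }, { price: null }, { price: 30 }];
function isExpensive(record) {
  if (record.price === null) {
    throw new Error("price is missing");
  }
  return record.price > 5;
}
var kept = filterRecords(records, isExpensive, ON_ERROR_IGNORE);
```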

5.3.14 Form Iterator

• Object: Form_Iterator
• Description: generates an execution loop for a specific form, where predetermined values for each of the included fields are used in each run.
• Functions:
  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
    • findForm: NSEQL program that locates the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.
    • inputPage: input page from which the selected form can be iteratively invoked.
    • parallelIterator: with “true”, the component will execute its iterations in parallel.
  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field of the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting with position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field of the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied by the field in the event of more than one field element with the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several on the form with the same value.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position held by each value element in the event of replicated values.
    • equal: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely be contained in them.

  o click(field, value, state): function that selects an element and runs a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on Radio Buttons, this parameter indicates the elements selected as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): function that indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be entered in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML text area field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be entered in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: number that determines the maximum number of iterations.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): determines whether the component launches the iterations in parallel.
    • flag: with “true”, the iterations will be executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.

  o hasNext(): function that determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.
  o close(): function that closes the iterator.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither an explicit back sequence nor a POST navigation has been configured.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
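
The hasNext/next iteration pattern, the limit set by setMaxIterations and the two combinations produced by CLICKED_AND_NON_CLICKED_ELEMENT can be pictured with the stand-alone sketch below. It is not the Form_Iterator component: createCombinationIterator is a hypothetical name, and the assumption that every value of one field is combined with every value of the others is made only for this illustration.

```javascript
// Hypothetical stand-in, not the ITPilot Form_Iterator: it enumerates the
// field-value combinations an iteration loop could submit, one per run.
function createCombinationIterator(fields, maxIterations) {
  // fields: array of { name: string, values: array of candidate values }
  var indexes = [];
  var produced = 0;
  var exhausted = false;
  for (var f = 0; f < fields.length; f++) {
    indexes.push(0);
    if (fields[f].values.length === 0) {
      exhausted = true; // a field without values yields no combination
    }
  }

  function hasNext() {
    return !exhausted &&
      (maxIterations === undefined || produced < maxIterations);
  }

  function next() {
    if (!hasNext()) {
      throw new Error("No more combinations");
    }
    var combination = {};
    for (var i = 0; i < fields.length; i++) {
      combination[fields[i].name] = fields[i].values[indexes[i]];
    }
    produced++;
    // Advance like an odometer: the last field changes fastest.
    var pos = fields.length - 1;
    while (pos >= 0) {
      indexes[pos]++;
      if (indexes[pos] < fields[pos].values.length) {
        break;
      }
      indexes[pos] = 0;
      pos--;
    }
    if (pos < 0) {
      exhausted = true;
    }
    return combination;
  }

  return { hasNext: hasNext, next: next };
}

// A checkbox treated as "clicked and non-clicked" contributes two states.
var iterator = createCombinationIterator([
  { name: "city", values: ["Madrid", "Paris"] },
  { name: "newsletter", values: [true, false] }
]);
var combinations = [];
while (iterator.hasNext()) {
  combinations.push(iterator.next());
}
```

Passing a second argument to createCombinationIterator caps the number of combinations returned, in the spirit of setMaxIterations.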

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identification.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. It is possible to execute the function with no parameters for later browsing by using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored “cookies” information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: is responsible for storing the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record will be generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.

    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. Specifying a format is optional, but once one is specified it becomes compulsory; otherwise the wrapper may fail to run. This representation format is defined in [DATEFORMAT].
    • obl: indicates whether a value for the field to create is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): update function for the component name.
    • name: new component name.
  o setI18n(i18n): function that updates the process internationalization (i18n).
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component, returning a record that represents the wrapper initialization parameters.
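
The meaning of the OPTIONAL, OBLIGATORY and FIXED values can be sketched in plain JavaScript as follows. This is not the Init component: resolveParameters and the shape of the field definitions are hypothetical, the three constants are redefined as strings only to keep the example self-contained, and the choices of ignoring a supplied value for a FIXED field and of using null for an unset optional field are assumptions of this sketch.

```javascript
// Hypothetical stand-in for the OPTIONAL / OBLIGATORY / FIXED semantics.
// The constants are local placeholders, not the ITPilot definitions.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

function resolveParameters(fieldDefs, queryValues) {
  var record = {};
  fieldDefs.forEach(function (def) {
    var supplied =
      Object.prototype.hasOwnProperty.call(queryValues, def.field);
    if (def.obl === FIXED) {
      // Constant value; a value sent in the query is ignored (assumption).
      record[def.field] = def.fixedValue;
    } else if (supplied) {
      record[def.field] = queryValues[def.field];
    } else if (def.obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + def.field);
    } else {
      // OPTIONAL and not supplied: left empty (null is an assumption).
      record[def.field] = null;
    }
  });
  return record;
}

var definitions = [
  { field: "TITLE", obl: OBLIGATORY },
  { field: "AUTHOR", obl: OPTIONAL },
  { field: "LANGUAGE", obl: FIXED, fixedValue: "en" }
];
var initRecord = resolveParameters(definitions, { TITLE: "Dune" });
```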

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option available in the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29.

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: component unique identifier.
    • uri: connection URL to the database.
    • driver: driver class to use to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6 Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: allows iteration over different inter-related pages, using one or several browsing sequences.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.

    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.

    • inputPage: indicates the page from which the next browsing sequence is to be run.

  o next(inputRecords, inputPage): returns the next iteration element.

    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.

    • inputPage: indicates the page from which the next pages are to be accessed.

  o close(): closes the iterator.

  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.

    • count: number of retries.

  o setRetryDelay(count): configures the interval between two retries.

    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: HTTP browser implementation

5.3.21 Output

• Object: Output

• Description: places a record in the wrapper output.

• Functions:

  o Constructor(structure)

    • structure: parameter that indicates the component input record to be used as the wrapper result.

  o add(record): adds the given component input record to the wrapper result.

    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.

• Functions:

  o Constructor(recordsObj, name)

    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.

    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): adds a new field to the record under construction.

    • fieldName: name of the field.

    • expression: field definition expression, e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    • errorAction: action to be run when the expression cannot be evaluated correctly. The possible values are:

      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.

    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.

    • inputPage: optional. Allows a starting page to be indicated.

  o exec(): returns a page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as browser identifier and releases that specific browser.

• Functions:

  o Constructor(page)

    • page: page loaded in the browser that is going to be released.

  o Constructor(browserUuid)

    • browserUuid: browser identifier.

  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
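The practical difference between the Loop and Repeat constructs is the moment at which the condition is checked. The sketch below uses plain JavaScript with a hypothetical makeCondition helper standing in for the exec method of a Condition object; it only illustrates that a while loop may run zero times, whereas a do… while loop always runs at least once.

```javascript
// Hypothetical stand-in for a Condition object: exec() returns each value
// of `answers` in turn, then false. It is not the ITPilot Condition.
function makeCondition(answers) {
    var calls = 0;
    return {
        exec: function (inputs) {
            return calls < answers.length ? answers[calls++] : false;
        }
    };
}

// while form used by Loop: the condition is evaluated before the body.
function runLoop(condition) {
    var iterations = 0;
    while (condition.exec([])) {
        iterations++;
    }
    return iterations;
}

// do ... while form used by Repeat: the body runs before the first check.
function runRepeat(condition) {
    var iterations = 0;
    do {
        iterations++;
    } while (condition.exec([]));
    return iterations;
}

var loopRuns = runLoop(makeCondition([false]));     // body never runs
var repeatRuns = runRepeat(makeCondition([false])); // body runs once
```

A flow whose body must run before the condition can be evaluated (for example, because the body produces the value being tested) therefore fits the do… while form better.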

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it holds within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)

    • sequence: NSEQL browsing program (see [NSEQL]).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.

  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.

    • inputValues: list of values that can be used as input parameters within the browsing sequence.

    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o close(): closes the connection with the running browser.

  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

    • pages: number of back pages.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
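The setRetries and setRetryDelay settings describe a retry-with-delay policy. A minimal sketch of such a policy in plain JavaScript follows; runWithRetries, action and sleep are hypothetical names, and how ITPilot applies these settings internally is not specified here. The sketch assumes that the retry count is the number of additional attempts made after the first failure.

```javascript
// Hypothetical helper: calls action() and, if it throws, tries again up to
// `retries` more times, calling sleep(delayMs) between attempts.
function runWithRetries(action, retries, delayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action();
        } catch (error) {
            if (attempt >= retries) {
                throw error; // retries exhausted: propagate the last error
            }
            attempt++;
            sleep(delayMs);
        }
    }
}

// Usage: an action that fails twice and then succeeds.
var calls = 0;
var waits = [];
var page = runWithRetries(
    function () {
        calls++;
        if (calls < 3) {
            throw new Error("connection failed");
        }
        return "page loaded";
    },
    3,                                // comparable to setRetries(3)
    3000,                             // comparable to setRetryDelay(3000)
    function (ms) { waits.push(ms); } // records the delay instead of waiting
);
```

Passing the sleep function as a parameter keeps the sketch synchronous and easy to test; a real delay would block or be scheduled by the host environment.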

5.3.28 Store File

• Object: StoreFile

• Description: stores the contents received as the input parameter in a file.

• Functions:

  o Constructor(content, file)

    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case the page content is stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.

    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread

• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.

  o execute(functionName, <list of arguments>): launches a thread that runs the given function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.

    • int: maximum number.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:

    • LIST_TYPE = 1

    • PAGE_TYPE = 2

    • RECORD_TYPE = 3

    • SIMPLE_TYPE = 4

    • ARRAY_TYPE = 5

    • BINARY_TYPE = 6

    • BOOLEAN_TYPE = 7

    • DATE_TYPE = 8

    • DOUBLE_TYPE = 9

    • FLOAT_TYPE = 10

    • INT_TYPE = 11

    • LONG_TYPE = 12

    • STRING_TYPE = 13

    • URL_TYPE = 14

    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
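As an illustration of this naming scheme, the following sketch shows a component called “upper” that returns its input converted to upper case. The component name, the shape of the input and the body of the main function are hypothetical. STRING_TYPE is declared locally, using the value from the list of output types, only so that the fragment runs on its own. The upper_getInputStructure function is left out because the format of its body is not detailed in this guide, and upper_getOutputStructure is not needed because the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Declared locally so that the sketch is self-contained; the value
// matches the STRING_TYPE entry in the list of output types.
var STRING_TYPE = 13;

// Main function, named <component>_main(<component>_input).
// Assumption: the input arrives as a plain string.
function upper_main(upper_input) {
    var upper_output = null;
    if (upper_input !== null && upper_input !== undefined) {
        upper_output = String(upper_input).toUpperCase();
    }
    return upper_output;
}

// Declares the output type of the component.
function upper_getOutputType() {
    return STRING_TYPE;
}

var upperResult = upper_main("itpilot");
```

Following the convention described in this section, such a file would be named after the component and stored with the .js suffix in the custom components directory.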

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> represents the name of the component.

• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing a VQL statement as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘\’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

  o setDeletionSuffixLabel(deletionSuffixLabel): modifies the tag that ends the deleted data.

    • deletionSuffixLabel: suffix to use when generating the result page for the deleted content (by default, the end tag of the red-background HTML markup).

  o setNullWhenEquals(nullWhenEquals): if the result page is identical to either of the two input pages, the component returns “null” instead of the page itself.

    • nullWhenEquals: “true” means that “null” is returned when both pages are equal. “false” means that the result page is returned.

  o setIgnoreTagAttributes(simplifyTags): makes the component disregard HTML tag attributes when comparing the two pages.

    • simplifyTags: “true” means that the HTML tag attributes are ignored. With “false” they are not ignored.

  o setCaseInsensitive(toLowerCase): establishes whether capitalization is taken into account when comparing the pages.

    • toLowerCase: “true” transforms all HTML content to lower case. “false” keeps the content as is.

  o setShowRemovedContent(mergedDeletions): establishes whether the deleted content is shown in the result page.

    • mergedDeletions: with “true”, the deleted content is shown. With “false”, the configuration set by the functions setDeletionPrefixLabel and setDeletionSuffixLabel is not taken into account.

  o addTokenReplacement(replacement): adds a regular expression to a list. These regular expressions can be applied to the HTML tokens of the source pages before they are compared.

    • replacement: Perl [PERL] regular expression.

  o addIgnoredToken(regexp): adds a regular expression to the list of ignored tokens. These regular expressions are applied to the HTML tokens of the page; those that match are discarded before the comparison starts.

    • regexp: Perl [PERL] regular expression.

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence command (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

    var Execute_JavaScript_1 = null;
    var Execute_JavaScript_1_output = null;
    Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
    Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
    Execute_JavaScript_1.setRetries(3);
    Execute_JavaScript_1.setRetryDelay(3000);
    Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4 Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.

• Functions:

  o Constructor(expression)

    • expression: object that defines the expression. This object is expressed as a string of characters (e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second). To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.

    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
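The positional references $0, $1, … used in expressions can be pictured as indexes into the list passed to exec. The sketch below is a hypothetical illustration in plain JavaScript, not the ITPilot expression evaluator: resolvePositional is an invented helper that simply looks up each $N placeholder in an input array.

```javascript
// Hypothetical helper: finds every $N placeholder in `expression` and
// returns the N-th elements of `inputs` (0-based) in order of appearance.
function resolvePositional(expression, inputs) {
    var values = [];
    var pattern = /\$(\d+)/g;
    var match;
    while ((match = pattern.exec(expression)) !== null) {
        var index = parseInt(match[1], 10);
        if (index >= inputs.length) {
            throw new Error("No input at position " + index);
        }
        values.push(inputs[index]);
    }
    return values;
}

// "($0 <= $1)" with inputs [3, 7]: $0 refers to 3 and $1 refers to 7.
var operands = resolvePositional("($0 <= $1)", [3, 7]);
var holds = operands[0] <= operands[1];
```

Here the comparison itself is written directly in JavaScript; in ITPilot the whole expression string, including its operators and functions, is evaluated by the component.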

5.3.11 Extractor

• Object: Extractor

• Description: responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).

• Functions:

  o Constructor(name, page, specification, structure)

    • name: name of the Extractor component instance.

    • page: page-type ITPilot structure from which data is to be extracted.

    • specification: DEXTL data extraction specification (see [DEXTL]).

    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.

  o exec(): main extractor method, which runs the specification indicated in the constructor. This function returns a list of records of the type defined by the structure parameter of the constructor.

  o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).

    • merge: Boolean parameter. “true” if the pattern merge technique is to be applied, “false” if not. The default is “true”.

  o setI18n(i18n): updates the internationalization used by the process.

    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

  o Constructor(url, sequenceType, reusableConnection, binary, page)

    • url: URL where the resource to be downloaded can be found (OPTIONAL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • binary: “true” if the object to be downloaded is binary, “false” if it is in text format.

    • page: optionally, the page from which the HTTP request is launched can be indicated.

  o exec(page): runs the component, returning the string- or binary-type value obtained.

    • page: optionally, the page from which the HTTP request is launched can be indicated.

  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.

    • encoding: MIME type of the information to send.

  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.

    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.

  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, against which further extraction operations are going to be executed.

    • back: back sequence NSEQL program.

  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

    • reusingConnection: if set to “true”, the connection coming from previous components is reused. If set to “false”, a new browser is launched, importing information from the previous session.

  o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed because no back sequence has been defined and no POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation

5.3.13 Filter

• Object: Filter

• Description: this carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

o Constructor(expr, auxiliaryRecords)

• expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.

• auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subset that complies with the selection expression indicated in the constructor.

• inputRecords: list of input records.

• auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
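The effect of ON_ERROR_IGNORE described in the note can be sketched in plain JavaScript. This is only an illustration under stated assumptions, not ITPilot's implementation: filterRecords, the predicate function and the sample records are hypothetical stand-ins, and the real Filter component receives an expression rather than a JavaScript function.

```javascript
// Hypothetical stand-ins: names mirror the constants used in this guide,
// but the real Filter takes an ITPilot expression, not a JS predicate.
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

function filterRecords(inputRecords, predicate, errorAction) {
  const result = [];
  for (const record of inputRecords) {
    try {
      if (predicate(record)) {
        result.push(record);
      }
    } catch (err) {
      if (errorAction === ON_ERROR_RAISE) {
        throw err; // stop and report the source of the error
      }
      // ON_ERROR_IGNORE: skip only the record that caused the error
    }
  }
  return result;
}

const records = [
  { TITLE: "a", PRICE: 10 },
  { TITLE: "b", PRICE: null }, // evaluating the predicate on this record fails
  { TITLE: "c", PRICE: 30 },
];

const hasValidPrice = (r) => {
  if (r.PRICE === null) throw new Error("PRICE is missing");
  return r.PRICE >= 10;
};

const kept = filterRecords(records, hasValidPrice, ON_ERROR_IGNORE);
console.log(kept.map((r) => r.TITLE)); // the failing record "b" is left out
```

With ON_ERROR_RAISE passed instead, the same call would stop at the second record and propagate the error.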

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: this generates a loop over a specific form, in which predetermined values for each of the included fields are used in each run.

• Functions:

o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

• findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

• submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

• baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.

• inputPage: input page from which the selected form is iteratively invoked.

• parallelIterator: if "true", the component executes its iterations in parallel.

o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting at position 0.

• positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

• clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): this indicates the values selected in a multiple selection field of the chosen form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting at position 0.

• valuesArray: list of values that must be selected in the field.

• positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

• equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or just contained in it (equals = false).

• clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectPositions(field, position, positions): this indicates the values selected in a selection field of the chosen form.

• field: name of the HTML selection field.

• position: position of the field when more than one field has the same name.

• positions: values of the elements on which the component must iterate.

o selectTexts(field, position, values, positions, equal): this indicates the values to be used in the different iterations on a text field.

• field: name of the HTML text field.

• position: position of the field when several fields on the form have the same name.

• values: list of values that must be selected in the field.

• positions: list that indicates the position held for each value element in the event of replicated values.

• equals: boolean value that indicates whether the field values must exactly match those provided to the function or may merely be contained in them.

o click(field, value, state): function that selects an element and runs a "click" event on it.

• field: name of the HTML field on which the click is to be made.

• value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.

• state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o input(field, position, values): function that indicates the values entered in an input field.

• field: name of the HTML input field.

• position: position of the field when several fields on the form have the same name.

• values: list of values to be entered in the field.

o textarea(field, position, values): this indicates the values entered in a text area.

• field: name of the HTML text area field.

• position: position of the field when several fields on the form have the same name.

• values: list of values to be entered in the field.

o toList(): returns the list of NSEQL sequences used in each iteration.

o setMaxIterations(count): sets the maximum number of iterations that can be executed.

• count: maximum number of iterations.

o setRetries(count): sets the number of retries in the event of failures.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.

o setParallelIterator(flag): makes the component launch its iterations in parallel.

• flag: if "true", the iterations are executed in parallel.

o next(inputPage): this returns the page resulting from running one iteration of the component.

• inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.

o hasNext(): function that determines whether there are more results. It returns "true" if there is at least one more result, or "false" if there is not.

o close(): function that closes the iterator.

o syncWithPost(flag): this function indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization method.

• flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): this function optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.

• reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.

o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, that is, when neither a back sequence nor a POST navigation has been configured.

o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation.

• 1: Internet Explorer browser implementation.

• 2: Firefox browser implementation.

• 3: Denodo HTTP browser implementation.
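To give an intuition of how the field settings of a Form Iterator translate into iterations, the following sketch expands a set of field alternatives into combinations and caps them in the way setMaxIterations would. It is plain JavaScript with hypothetical helpers (expandState, buildIterations) and sample field names; it assumes that iterations are the Cartesian product of the alternatives of each field, which the text above suggests but does not state explicitly.

```javascript
// Hypothetical constants mirroring the names used in this section.
const CLICKED_ELEMENT = "CLICKED_ELEMENT";
const NON_CLICKED_ELEMENT = "NON_CLICKED_ELEMENT";
const CLICKED_AND_NON_CLICKED_ELEMENT = "CLICKED_AND_NON_CLICKED_ELEMENT";

// Expand a checkbox state into the alternatives it stands for.
function expandState(state) {
  if (state === CLICKED_AND_NON_CLICKED_ELEMENT) {
    return [true, false]; // two combinations: marked and unmarked
  }
  return [state === CLICKED_ELEMENT];
}

// Build every combination of the alternatives of each field
// (assumption: combinations are the Cartesian product of the fields).
function buildIterations(fieldAlternatives, maxIterations) {
  let combos = [{}];
  for (const [field, alternatives] of Object.entries(fieldAlternatives)) {
    const expanded = [];
    for (const combo of combos) {
      for (const value of alternatives) {
        expanded.push({ ...combo, [field]: value });
      }
    }
    combos = expanded;
  }
  return maxIterations === undefined ? combos : combos.slice(0, maxIterations);
}

const iterations = buildIterations({
  city: ["Madrid", "Paris"], // values as they might be given to input(...)
  onlyOffers: expandState(CLICKED_AND_NON_CLICKED_ELEMENT), // as given to click(...)
});

// Consume the iterations one by one, in the spirit of hasNext()/next().
let position = 0;
while (position < iterations.length) {
  console.log(iterations[position]);
  position += 1;
}
```

Two input values combined with a checkbox in the CLICKED_AND_NON_CLICKED_ELEMENT state yield four iterations here; passing a second argument to buildIterations truncates the list, which is how a maximum number of iterations is modeled in this sketch.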

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, using a previously retrieved identifier.

• Functions:

o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.

• browserUuid: browser id.

o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters for later browsing with a Sequence object (see section 5.3.27).

• pageType: type of browser used to access the page:

• SEQUENCE_IEBROWSER = 1

• SEQUENCE_HTTP_BROWSER = 2

• lastURL: last URL the page is coming from.

• lastURLMethod: access method (GET, POST) of the URL the page is coming from.

• lastURLPostParameters: POST-method parameters of the URL the page is coming from.

• cookie: stored "cookies" information.

• proxyUser: user name to access the proxy, if required.

• proxyPassword: user password to access the proxy, if required.

• proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.

• Functions:

o Constructor(input, output)

• input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

• output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).

o get(name): this returns the value of a field of the record created from the initialization parameters.

• name: name of the record field.

o setText(field, obl, fixedValue): this creates a text-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setInt(field, obl, fixedValue): this creates an integer-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setLong(field, obl, fixedValue): this creates a long-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setFloat(field, obl, fixedValue): this creates a float-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setDouble(field, obl, fixedValue): this creates a double-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setBlob(field, obl, fixedValue): this creates a BLOB-type (binary large object) field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setBoolean(field, obl, fixedValue): this creates a Boolean-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setLink(field, obl, fixedValue): this creates a URL-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setDate(field, format, obl, fixedValue): this creates a date-type field in the initialization record.

• field: name of the field to create.

• format: representation format of the date field. Specifying a format is optional, but once specified it becomes compulsory; otherwise the wrapper may not run. This representation format is defined in [DATEFORMAT].

• obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setName(name): sets the component name.

• name: new component name.

o setI18n(i18n): function that sets the internationalization (i18n) configuration of the process.

• i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

o exec(): main function for running the component; it returns a record representing the wrapper initialization parameters.
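The meaning of the OPTIONAL, OBLIGATORY and FIXED values can be illustrated with a small plain-JavaScript sketch. It is not the Init component itself: buildInitRecord, the field names and the treatment of missing optional values (set to null) are assumptions made for the example.

```javascript
// Hypothetical stand-ins for the constants named in this section.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// Field declarations, in the spirit of what setText/setInt/... register.
const declaredFields = [
  { name: "KEYWORD", obl: OBLIGATORY },
  { name: "MAX_PRICE", obl: OPTIONAL },
  { name: "LANGUAGE", obl: FIXED, fixedValue: "en" },
];

// Build the initialization record from the parameters of one query.
function buildInitRecord(fields, queryParams) {
  const record = {};
  for (const field of fields) {
    if (field.obl === FIXED) {
      record[field.name] = field.fixedValue; // constant value
    } else if (field.name in queryParams) {
      record[field.name] = queryParams[field.name];
    } else if (field.obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + field.name);
    } else {
      record[field.name] = null; // assumption: optional and not supplied
    }
  }
  return record;
}

const initRecord = buildInitRecord(declaredFields, { KEYWORD: "laptop" });
console.log(initRecord);
```

A query that omits KEYWORD is rejected in this sketch, while MAX_PRICE may be left out and LANGUAGE always carries its fixed value.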

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records, one by one.

• Functions:

o Constructor(list)

• list: list of records on which to iterate.

o hasNext(): this determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.

o next(): this returns the next iteration element. The list is an ordered sequence of records.

The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
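For comparison with the parallel structure of Figure 5, the sequential hasNext()/next() protocol can be sketched as follows. ListIterator is a hypothetical stand-in written for this example so that the loop can be run outside ITPilot; it is not the Iterator object itself.

```javascript
// Hypothetical stand-in that mimics the hasNext()/next() protocol.
class ListIterator {
  constructor(list) {
    this.list = list;
    this.position = 0;
  }
  // true while at least one more record is available
  hasNext() {
    return this.position < this.list.length;
  }
  // returns the next record, following the order of the list
  next() {
    const record = this.list[this.position];
    this.position += 1;
    return record;
  }
}

const iterator = new ListIterator([{ ID: 1 }, { ID: 2 }, { ID: 3 }]);
const seen = [];
while (iterator.hasNext()) {
  const recordInstance = iterator.next();
  seen.push(recordInstance.ID); // per-record processing goes here
}
console.log(seen);
```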

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: this component sends a query to any source available via JDBC and returns a record list with the results obtained.

• Functions:

o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

• uuid: unique identifier of the component.

• uri: connection URL to the database.

• driver: driver class used to connect to the data source.

• userName: user name.

• password: user password.

• structure: structure of the component's output record list. It is defined as a record of values.

• baseRecords: record list to be used.

• maxPoolSize: maximum number of connections that the pool can manage at the same time.

• initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.

• checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

• query: SQL query that returns the results required by the component.

o exec(query, baseRecords): executes the JDBCExtractor component.

• query: SQL query that returns the results required by the component.

• baseRecords: record list to be used.

o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

• maxPoolSize: maximum number of connections that the pool can manage at the same time.

• initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.

• pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

o disablePool(): disables the connection pool.

o addDriverProperty(propname, propvalue): adds a JDBC driver property.

• propname: property name.

• propvalue: property value.

5.3.19 Loop

• Description: this creates loops in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function
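The structure of Figure 6 can be tried outside ITPilot with a stand-in for the Condition object. In the sketch below, ConditionStandIn is hypothetical: its exec() simply calls a JavaScript function, whereas the real Condition evaluates an ITPilot expression; the page counter is sample data.

```javascript
// Hypothetical stand-in: the real Condition evaluates an ITPilot
// expression; here exec() simply calls a JavaScript function.
class ConditionStandIn {
  constructor(test) {
    this.test = test;
  }
  exec() {
    return this.test();
  }
}

let page = 1;
const visited = [];
const loop = new ConditionStandIn(() => page <= 3);

// The condition is checked before every run of the loop body.
while (loop.exec()) {
  visited.push(page); // stands for <loop operations>
  page += 1;
}
console.log(visited);
```

Because the condition is evaluated first, the body does not run at all when the condition is false from the start.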

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: this iterates through different inter-related pages, using one or several browsing sequences.

• Functions:

o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

• sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.

• iterations: this indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information.

• inputPage: this indicates the page from which the next browsing sequence is run.

o next(inputRecords, inputPage): this returns the next iteration element.

• inputRecords: list of input records that can be used as parameters within the browsing sequences of the next interval.

• inputPage: this indicates the page from which the next pages are accessed.

o close(): this closes the iterator.

o setRetries(count): this configures the number of retries in the event of an error when accessing the next page.

• count: number of retries.

o setRetryDelay(count): this configures the interval between two retries.

• count: interval in milliseconds.

o syncWithPost(flag): this function indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.

• flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): this function optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.

• reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.

o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, that is, when neither a back sequence nor a POST navigation has been configured.

o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation.

• 1: Internet Explorer browser implementation.

• 2: Firefox browser implementation.

• 3: HTTP browser implementation.
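The order in which the synchronization options are considered (POST, explicit back sequence, Back() command) can be summarized in a short decision function. This is a sketch of the rules as described above, not ITPilot code: chooseBackStrategy, the strategy labels, the placeholder back program and the default of one page when no number of back pages is given are all assumptions of the example.

```javascript
// Decide how to return to the source page. Strategy labels are illustrative.
function chooseBackStrategy(config) {
  if (config.syncWithPost) {
    // syncWithPost(true): re-issue the POST request that produced the page
    return { strategy: "POST" };
  }
  if (config.backSequence) {
    // setBackSequence(back): run the explicit NSEQL back program
    return { strategy: "BACK_SEQUENCE", program: config.backSequence };
  }
  // Neither option configured: run Back(), going back setBackPages pages
  // (assumption: one page when no value was set).
  return { strategy: "BACK_COMMAND", pages: config.backPages || 1 };
}

const withPost = chooseBackStrategy({ syncWithPost: true });
const withSequence = chooseBackStrategy({
  syncWithPost: false,
  backSequence: "<NSEQL back program>", // placeholder, not real NSEQL
});
const withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });

console.log(withPost.strategy, withSequence.strategy, withBack.strategy);
```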

5.3.21 Output

• Object: Output

• Description: this places a record in the wrapper output.

• Functions:

o Constructor(structure)

• structure: parameter that indicates the component input record to be used as the wrapper result.

o add(record): this adds the given component input record to the wrapper result.

• record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: this constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.

• Functions:

o Constructor(recordsObj, name)

• recordsObj: list of input elements. Each element of the list can be a record or a list of records.

• name: name of the output record of the Record Constructor component.

o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.

• fieldName: name of the field.

• expression: field definition expression; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

• errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:

• ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

• ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

o exec(): this runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: this creates a browsing sequence from the results of a record. It allows sequences that access other pages to be created from pages processed by the Extractor component.

• Functions:

o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

• sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.

• sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information. In general this value will be "true", although it may not be a good option when the previous iterator runs in parallel.

• inputPage: optional; this indicates a starting page.

o exec(): this returns a page object that represents the target page of the browsing sequences.

o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as browser identifier and releases that specific browser.

• Functions:

o Constructor(page)

• page: page loaded in the browser that is going to be released.

o Constructor(browserUuid)

• browserUuid: browser identifier.

o exec(): executes the component.

5.3.25 Repeat

• Description: this creates loops in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
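The practical difference between the while structure used by Loop and the do… while structure used by Repeat is that the body of the latter always runs at least once. The sketch below shows this with two hypothetical helpers, runWhile and runDoWhile, driven by the same condition function; it uses plain JavaScript functions instead of Condition objects.

```javascript
// Count how many times the loop body runs with a while structure.
function runWhile(condition) {
  let runs = 0;
  while (condition()) {
    runs += 1;
  }
  return runs;
}

// Count how many times the loop body runs with a do...while structure.
function runDoWhile(condition) {
  let runs = 0;
  do {
    runs += 1;
  } while (condition());
  return runs;
}

// A condition that is false from the very first evaluation.
const alwaysFalse = () => false;

const whileRuns = runWhile(alwaysFalse);
const doWhileRuns = runDoWhile(alwaysFalse);
console.log(whileRuns, doWhileRuns);
```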

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated. When the component is used from the graphic generation interface, it becomes a JavaScript function that is invoked from its position within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: this creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

o Constructor(sequence, sequenceType, reusableConnection, inputPage)

• sequence: NSEQL browsing program (see [NSEQL]).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: this indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

• inputPage: optional parameter that indicates the starting page. If it is not specified, the NSEQL program is run directly.

o exec(inputValues, inputPage): this runs the Sequence component, returning the last page that the browsing sequence has reached.

• inputValues: list of values that can be used as input parameters within the browsing sequence.

• inputPage: optional parameter that indicates the page from which the component browsing sequence is run.

o setRetries(count): sets the number of retries in the event of failures.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.

o close(): this closes the connection with the running browser.

o syncWithPost(flag): this function indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.

• flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.

    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.

  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.

    • pages: number of pages to go back.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): sets the browser implementation used by the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
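The interplay between syncWithPost, setBackSequence and setBackPages described above can be read as a three-way decision. The following is a minimal sketch of that reading, not ITPilot code: the function chooseBackStrategy, its options object and the returned labels are hypothetical names introduced for illustration, and the fallback of one page when setBackPages was never called is an assumption.

```javascript
// Hypothetical model of how a component decides how to return to its source page.
// None of these names belong to the ITPilot API; they only mirror the prose above.
function chooseBackStrategy(options) {
  var syncWithPost = options.syncWithPost; // flag passed to syncWithPost(flag)
  var backSequence = options.backSequence; // NSEQL program passed to setBackSequence(back), if any
  var backPages = options.backPages;       // value passed to setBackPages(pages), if any

  if (syncWithPost) {
    // Re-issue the POST request with the parameters the page originally arrived with.
    return { kind: "post" };
  }
  if (backSequence) {
    // An explicit back sequence is used when POST synchronization is disabled.
    return { kind: "back-sequence", program: backSequence };
  }
  // Neither option is configured: fall back to the NSEQL Back() command.
  // Assumption: one page back when setBackPages was never called.
  return { kind: "nseql-back", pages: backPages === undefined ? 1 : backPages };
}

console.log(chooseBackStrategy({ syncWithPost: false, backPages: 2 }));
```

The order of the checks is the point of the sketch: POST synchronization is consulted first, the explicit back sequence second, and Back() only as a last resort.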

5.3.28 Store File

• Object: StoreFile

• Description: stores the contents received as the input parameter in a file.

• Functions:

  o Constructor(content, file)

    • content: string- or binary-type value with the contents to be stored. A page value is also accepted as input; in that case, the page content is stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name is generated automatically when the input file is null or is a directory.

    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): sets the number of retries in the event of failure.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.
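The retry settings shared by several components (setRetries and setRetryDelay) can be pictured with a small loop. This is only a sketch under stated assumptions: runWithRetries is a hypothetical helper, not an ITPilot function; it assumes that count is the number of extra attempts made after the first failure; and the waiting function is passed in as a parameter so that the example stays synchronous.

```javascript
// Hypothetical helper, not an ITPilot function: run `action`, retrying on failure.
// `retries` mirrors setRetries(count) and `delayMs` mirrors setRetryDelay(mseconds).
// Assumption: `retries` counts the extra attempts made after the first failure.
function runWithRetries(action, retries, delayMs, sleep) {
  var attempts = 0;
  var lastError = null;
  while (attempts <= retries) {
    attempts += 1;
    try {
      return { value: action(attempts), attempts: attempts };
    } catch (err) {
      lastError = err;
      if (attempts <= retries) {
        sleep(delayMs); // wait before the next attempt
      }
    }
  }
  throw lastError; // every attempt failed
}

// Usage: an action that fails twice and then succeeds.
var waits = [];
var outcome = runWithRetries(function (attempt) {
  if (attempt < 3) { throw new Error("temporary failure"); }
  return "stored";
}, 3, 3000, function (ms) { waits.push(ms); });
console.log(outcome.attempts, waits.length);
```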

5.3.29 Thread

• Object: Thread

• Description: represents a thread in the ITPilot wrapper. It is often used when the records obtained in an extraction operation are to be post-processed concurrently.

• Functions:

  o wait(): puts the thread on standby until all the executions launched with the execute function have finished.

  o execute(functionName, <list of arguments>): launches a thread that runs the specified function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.

    • int: maximum number of concurrent threads.
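The queueing rule of setMaxConcurrentThreads can be illustrated with a small model. This is a sketch only: ThreadPoolSketch and its members are hypothetical and are not the ITPilot Thread object. In particular, each job here receives a done callback as its last argument and wait() takes a callback instead of blocking, because plain JavaScript has no blocking wait; the absence of a limit before setMaxConcurrentThreads is called is also an assumption.

```javascript
// Hypothetical model of Thread.execute / wait / setMaxConcurrentThreads.
// It is not ITPilot code; it only illustrates the queueing rule described above.
function ThreadPoolSketch() {
  this.max = Infinity; // assumption: no limit until setMaxConcurrentThreads is called
  this.running = 0;    // executions currently in progress
  this.queue = [];     // executions waiting for a free slot
  this.waiters = [];   // callbacks registered through wait()
}

ThreadPoolSketch.prototype.setMaxConcurrentThreads = function (max) {
  this.max = max;
};

// Launch fn(args..., done); the job must call done() when its work finishes.
ThreadPoolSketch.prototype.execute = function (fn) {
  var args = Array.prototype.slice.call(arguments, 1);
  this.queue.push({ fn: fn, args: args });
  this._drain();
};

// Register a callback that fires once every launched execution has finished.
ThreadPoolSketch.prototype.wait = function (callback) {
  this.waiters.push(callback);
  this._drain();
};

ThreadPoolSketch.prototype._drain = function () {
  var self = this;
  // Start queued jobs while there are free slots.
  while (this.running < this.max && this.queue.length > 0) {
    var job = this.queue.shift();
    this.running += 1;
    job.fn.apply(null, job.args.concat(function done() {
      self.running -= 1;
      self._drain();
    }));
  }
  // Nothing running and nothing queued: release the waiters.
  if (this.running === 0 && this.queue.length === 0) {
    var waiters = this.waiters;
    this.waiters = [];
    waiters.forEach(function (cb) { cb(); });
  }
};
```

With a limit of two, a third call to execute is held in the queue and only starts when one of the first two jobs reports that it has finished.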

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components, and define the following functions in it:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where "mycustom" is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the output type of the component. The possible values are:

    • LIST_TYPE = 1

    • PAGE_TYPE = 2

    • RECORD_TYPE = 3

    • SIMPLE_TYPE = 4

    • ARRAY_TYPE = 5

    • BINARY_TYPE = 6

    • BOOLEAN_TYPE = 7

    • DATE_TYPE = 8

    • DOUBLE_TYPE = 9

    • FLOAT_TYPE = 10

    • INT_TYPE = 11

    • LONG_TYPE = 12

    • STRING_TYPE = 13

    • URL_TYPE = 14

    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function defines the output structure returned by the component. It is required only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
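As an illustration of the naming scheme, the following sketch defines the main and output-type functions of a component called "shout". It is not taken from ITPilot: the component name, the assumption that the input is a plain string, the local OUTPUT_TYPES table (which only restates the codes listed above) and the helper needsOutputStructure are all hypothetical. The input-structure function is left out because its body is not shown in this guide.

```javascript
// Output type codes exactly as listed for <component>_getOutputType().
var OUTPUT_TYPES = {
  LIST_TYPE: 1, PAGE_TYPE: 2, RECORD_TYPE: 3, SIMPLE_TYPE: 4, ARRAY_TYPE: 5,
  BINARY_TYPE: 6, BOOLEAN_TYPE: 7, DATE_TYPE: 8, DOUBLE_TYPE: 9, FLOAT_TYPE: 10,
  INT_TYPE: 11, LONG_TYPE: 12, STRING_TYPE: 13, URL_TYPE: 14, BROWSER_ID_TYPE: 15
};

// Hypothetical helper (not part of ITPilot): tells whether a component with this
// output type also has to provide <component>_getOutputStructure().
function needsOutputStructure(outputType) {
  return outputType === OUTPUT_TYPES.RECORD_TYPE ||
         outputType === OUTPUT_TYPES.LIST_TYPE;
}

// Example component named "shout": returns its input in upper case.
// Assumption: the input is a plain string.
function shout_main(shout_input) {
  var shout_output = null;
  shout_output = String(shout_input).toUpperCase();
  return shout_output;
}

// The component returns a string, so no output structure is required.
function shout_getOutputType() {
  return OUTPUT_TYPES.STRING_TYPE;
}

console.log(shout_main("hello"), needsOutputStructure(shout_getOutputType()));
```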

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by one .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following piece of code:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> is the name of the component.

• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is straightforward. Only the following VQL statement has to be written:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been developed.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

5.3.9 ExecuteJS

• Description: ITPilot provides a component called ExecuteJS that lets the user execute a JavaScript expression as part of a navigation sequence. This component is transformed into a Sequence component (see section 5.3.27) that executes the ExecuteJS NSEQL command (see [NSEQL]).

var Execute_JavaScript_1 = null;
var Execute_JavaScript_1_output = null;
Execute_JavaScript_1 = new SEQUENCE(sequenceExecuteJS(<JavaScript code here>), SEQUENCE_IEBROWSER);
Execute_JavaScript_1.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(CONNECTION_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(SEQUENCE_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(HTTP_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.onError(TIMEOUT_ERROR, ON_ERROR_RAISE);
Execute_JavaScript_1.setRetries(3);
Execute_JavaScript_1.setRetryDelay(3000);
Execute_JavaScript_1_output = Execute_JavaScript_1.exec([]);

Figure 4 Using the ExecuteJS NSEQL command

5.3.10 Expression

• Object: Expression

• Description: defines an expression (based on constants and/or functions provided by ITPilot) that is evaluated to produce an output value.

• Functions:

  o Constructor(expression)

    • expression: object that defines the condition expression. It is expressed as a character string; e.g. MyCondition = new CONDITION("($0 <= $1)") indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions, described in Appendix A of [GENER].

  o exec(exprInput): runs the component and returns the value that results from evaluating the expression given in the constructor.

    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
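The positional placeholders in the expression string ($0, $1 and so on) refer to the elements of the list passed to exec. The following sketch only illustrates that correspondence by textual substitution; bindPlaceholders is a hypothetical helper and says nothing about how ITPilot actually evaluates expressions.

```javascript
// Hypothetical illustration, not ITPilot's evaluator: substitute $0, $1, ...
// with the values found at the matching positions of the exec() input list.
function bindPlaceholders(expression, inputs) {
  return expression.replace(/\$(\d+)/g, function (match, index) {
    var position = Number(index);
    if (position >= inputs.length) {
      throw new Error("No input value for placeholder " + match);
    }
    return String(inputs[position]);
  });
}

// $0 is bound to the first element of the list and $1 to the second.
console.log(bindPlaceholders("($0 <= $1)", [3, 5]));
```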

5.3.11 Extractor

• Object: Extractor

• Description: extracts structured data from an HTML page by means of a DEXTL program ([DEXTL]).

• Functions:

  o Constructor(name, page, specification, structure)

    • name: name of the Extractor component instance.

    • page: page-type ITPilot structure from which the data is to be extracted.

    • specification: DEXTL data extraction specification (see [DEXTL]).

    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.

  o exec(): main method of the extractor. It runs the specification given in the constructor and returns a list of records of the type set in the structure parameter of the constructor.

  o setMergePatterns(merge): enables or disables the pattern merging technique, which optimizes the extraction (see [GENER] for further information).

    • merge: Boolean parameter; "true" if the pattern merging technique is to be applied, "false" if not. It is "true" by default.

  o setI18n(i18n): sets the internationalization configuration used by the process.

    • i18n: type of internationalization to use. ITPilot provides several internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: obtains the contents of the URL or page given as the input argument and returns them in binary or text format.

• Functions:

  o Constructor(url, sequenceType, reusableConnection, binary, page)

    • url: URL of the resource to be downloaded (optional).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

    • binary: "true" if the object to be downloaded is binary; "false" if it is in text format.

    • page: optionally, the page from which the HTTP request is launched.

  o exec(page): runs the component and returns the string- or binary-type value obtained.

    • page: optionally, the page from which the HTTP request is launched.

  o setEncoding(encoding): sets the MIME type [MIME] of the information to send.

    • encoding: MIME type of the information to send.

  o syncWithPost(flag): sets the method for recovering the page state. When it is enabled, ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.

    • flag: "true" means that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, against which further extraction operations are going to be executed.

    • back: back sequence NSEQL program.

  o setReusingConnection(reusingConnection): indicates whether connections will be reused.

    • reusingConnection: if "true", the connection coming from previous components is reused; if "false", a new browser is launched and the information from the previous session is imported into it.

  o setBackPages(pages): sets the number of pages ITPilot goes back when a Back() NSEQL command is executed because neither a back sequence nor a POST navigation has been defined.

  o setBrowserType(browserType): sets the browser implementation used by the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation

5.3.13 Filter

• Object: Filter

• Description: filters a list of records and returns those that meet a given condition.

• Functions:

  o Constructor(expr, auxiliaryRecords)

    • expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.

    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that satisfies the selection expression given in the constructor.

    • inputRecords: list of input records.

    • auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER returns the list of filtered elements except for the one that caused the error.
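The behaviour described in the NOTE can be pictured as a filter loop that skips any record whose evaluation fails. The following is a minimal sketch of that idea, not the FILTER component itself: filterIgnoringErrors is a hypothetical function, and the predicate argument stands in for the filter expression.

```javascript
// Hypothetical model of the NOTE above; the real FILTER component is provided by ITPilot.
// `predicate` stands in for the filter expression and may throw for a bad record.
function filterIgnoringErrors(records, predicate) {
  var kept = [];
  for (var i = 0; i < records.length; i += 1) {
    try {
      if (predicate(records[i])) {
        kept.push(records[i]);
      }
    } catch (err) {
      // ON_ERROR_IGNORE behaviour: drop only the record that failed and carry on.
    }
  }
  return kept;
}

// Usage: the second record has no price and makes the predicate fail.
var cheap = filterIgnoringErrors(
  [{ price: 5 }, { price: null }, { price: 20 }, { price: 8 }],
  function (record) {
    if (record.price === null) { throw new Error("price is missing"); }
    return record.price <= 10;
  }
);
console.log(cheap.length);
```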

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: generates an execution loop over a specific form; each iteration uses predetermined values for each of the form fields involved.

• Functions:

  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

    • findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

    • baseElements: optional list of records that can be used as variables in the NSEQL browsing sequences of this component.

    • inputPage: input page from which the selected form is invoked iteratively.

    • parallelIterator: if "true", the component executes its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at 0.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • clickedArray: list that indicates whether each valuesArray element is to be marked, left unmarked, or both. The following JavaScript constants are defined for this purpose:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at 0.

    • valuesArray: list of values that must be selected in the field.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one that appears in the selection field (equals = true) or merely contained in it (equals = false).

    • clickedArray: list that indicates whether each valuesArray element is to be marked, left unmarked, or both. The following JavaScript constants are defined for this purpose:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.

    • field: name of the HTML selection field.

    • position: position of the field when more than one field has the same name.

    • positions: values of the elements over which the component must iterate.

  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.

    • field: name of the HTML text field.

    • position: position of the field when several fields on the form have the same name.

    • values: list of values that must be selected in the field.

    • positions: list that indicates the position held by each element of values in the event of replicated values.

    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or need only contain them.

  o click(field, value, state): selects an element and runs a "click" event on it.

    • field: name of the HTML field on which the click is to be made.

    • value: when this function is run on radio buttons, this parameter gives the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it gives the value of the selectable element.

    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o input(field, position, values): indicates the values added to an input field.

    • field: name of the HTML input field.

    • position: position of the field when several fields on the form have the same name.

    • values: list of values that must be selected in the field.

  o textarea(field, position, values): indicates the values added to a text area.

    • field: name of the HTML input field.

    • position: position of the field when several fields on the form have the same name.

    • values: list of values that must be selected in the field.

  o toList(): returns the list of NSEQL sequences used in each iteration.

  o setMaxIterations(count): sets the maximum number of iterations that can be executed.

    • count: maximum number of iterations.

  o setRetries(count): sets the number of retries in the event of failure.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o setParallelIterator(flag): makes the component launch its iterations in parallel.

    • flag: if "true", the iterations are executed in parallel.

  o next(inputPage): returns the page that results from running one iteration of the component.

    • inputPage: optional parameter that sets a new starting page on which the next iteration of the component is run.

  o hasNext(): determines whether there are more results. It returns "true" if there is at least one more result and "false" otherwise.

  o close(): closes the iterator.

  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be sent to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization method.

    • flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function; if not, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.

    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.

  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence nor a POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): sets the browser implementation used by the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
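The iteration protocol of this component (register values per field, then loop with hasNext and next, optionally capped by setMaxIterations) can be sketched with a small model. Everything in it is hypothetical: FormIteratorSketch is not the ITPilot Form_Iterator, its input function omits the position argument, next returns the field values of the iteration rather than a page, and the assumption that iterations cover every combination of the registered values is ours.

```javascript
// Hypothetical model, not the ITPilot Form_Iterator: each registered field
// contributes a list of values, and every iteration uses one combination.
// Assumption: the combinations are the cartesian product of the per-field lists.
function FormIteratorSketch() {
  this.fields = [];              // [{ name, values }] in registration order
  this.maxIterations = Infinity; // mirrors setMaxIterations(count)
  this.index = 0;                // number of combinations already returned
}

// Mirrors input(field, position, values), without the position argument.
FormIteratorSketch.prototype.input = function (field, values) {
  this.fields.push({ name: field, values: values });
};

FormIteratorSketch.prototype.setMaxIterations = function (count) {
  this.maxIterations = count;
};

// Total number of combinations of the registered values.
FormIteratorSketch.prototype.total = function () {
  return this.fields.reduce(function (n, f) { return n * f.values.length; }, 1);
};

FormIteratorSketch.prototype.hasNext = function () {
  return this.index < Math.min(this.total(), this.maxIterations);
};

// Returns the field/value assignment used by the next iteration.
FormIteratorSketch.prototype.next = function () {
  if (!this.hasNext()) {
    throw new Error("No more iterations");
  }
  var remainder = this.index;
  var combination = {};
  // Decode the running index as a mixed-radix number; the last field varies fastest.
  for (var i = this.fields.length - 1; i >= 0; i -= 1) {
    var field = this.fields[i];
    combination[field.name] = field.values[remainder % field.values.length];
    remainder = Math.floor(remainder / field.values.length);
  }
  this.index += 1;
  return combination;
};

// Usage: two fields with two values each give four iterations.
var iterator = new FormIteratorSketch();
iterator.input("city", ["A", "B"]);
iterator.input("rooms", [1, 2]);
var seen = [];
while (iterator.hasNext()) {
  var c = iterator.next();
  seen.push(c.city + c.rooms);
}
console.log(seen.join(","));
```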

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.

• Functions:

  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.

    • browserUuid: browser id.

  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, so that browsing continues later through a Sequence object (see section 5.3.27).

    • pageType: type of browser used to access the page.

      • SEQUENCE_IEBROWSER = 1

      • SEQUENCE_HTTP_BROWSER = 2

    • lastURL: last URL that the page comes from.

    • lastURLMethod: access method (GET, POST) of the URL that the page comes from.

    • lastURLPostParameters: POST-method parameters of the URL that the page comes from.

    • cookie: stored cookie information.

    • proxyUser: user name to access the proxy, if required.

    • proxyPassword: user password to access the proxy, if required.

    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.

• Functions:

  o Constructor(input, output)

    • input: input record of the component. It is optional and is used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).

  o get(name): returns the value of a field of the record created from the initialization parameters.

    • name: name of the record field.

  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be given a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that sets the constant value assigned to the field.

  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be given a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that sets the constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be given a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that sets the constant value assigned to the field.

  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be given a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that sets the constant value assigned to the field.

  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be given a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that sets the constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be given a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that sets the constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be given a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that sets the constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is mandatory in queries on the wrapper. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be given a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.

    • fixedValue: optional parameter that sets the constant value assigned to the field.

o setDate(field format obl fixedValue) this creates a date-type field in the initialization record

bull field name of the field to create

bull format representation format of the date field This format is optional but becomes compulsory if completed Otherwise the wrapper may not be run This representation format is defined in [DATEFORMAT]

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
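The three obl modes can be illustrated outside ITPilot with a small stand-in. The sketch below is plain JavaScript, not the Init API: the field-declaration objects and the resolveQuery helper are hypothetical, and it assumes that a FIXED field always takes its fixedValue regardless of what the query supplies.

```javascript
// Hypothetical stand-in for the obl modes described above (not the ITPilot Init API).
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Resolve the values a query would run with, given the declared fields.
function resolveQuery(fields, query) {
    var resolved = {};
    for (var i = 0; i < fields.length; i++) {
        var f = fields[i];
        if (f.obl === FIXED) {
            // Assumption: a FIXED field always carries its constant value.
            resolved[f.field] = f.fixedValue;
        } else if (query.hasOwnProperty(f.field)) {
            resolved[f.field] = query[f.field];
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory field: " + f.field);
        }
        // OPTIONAL fields without a value are simply left out.
    }
    return resolved;
}

var fields = [
    { field: "KEYWORD", obl: OBLIGATORY },
    { field: "MAX_PRICE", obl: OPTIONAL },
    { field: "CATEGORY", obl: FIXED, fixedValue: "books" }
];

var resolved = resolveQuery(fields, { KEYWORD: "java" });
```

In this sketch a query that omits KEYWORD is rejected, one that omits MAX_PRICE is accepted, and CATEGORY never needs to be supplied.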

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option in the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
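The hasNext()/next() contract can be tried outside ITPilot with a stand-in object. ListIterator and collectTitles below are hypothetical plain-JavaScript substitutes written for this sketch, and records are assumed to be simple objects with a TITLE field.

```javascript
// Hypothetical stand-in for the ITPilot Iterator component (illustration only).
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}

// True while at least one element is left to visit.
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};

// Returns the current element and advances to the following one.
ListIterator.prototype.next = function () {
    return this.list[this.position++];
};

// Visit the records in list order, one by one.
function collectTitles(records) {
    var iterator = new ListIterator(records);
    var titles = [];
    while (iterator.hasNext()) {
        var record = iterator.next();
        titles.push(record.TITLE);
    }
    return titles;
}

var visited = collectTitles([{ TITLE: "first" }, { TITLE: "second" }]);
```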

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established up front, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established up front, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.

  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6 Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: allows iterating through different interrelated pages by means of one or more browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be sent to the page URL containing the POST parameters with which the page was originally reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, on which further data extraction operations are to be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g., “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
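How a positional field reference selects a value from recordsObj can be sketched as follows. This is a plain-JavaScript illustration, not the ITPilot expression engine: resolveReference is a hypothetical helper, references are assumed to take the form $<index>.<FIELD>, and records are assumed to be simple objects.

```javascript
// Hypothetical resolver for references of the assumed form "$<index>.<FIELD>".
// It is a plain-JavaScript illustration, not the ITPilot expression engine.
function resolveReference(reference, recordsObj) {
    var match = /^\$(\d+)\.(\w+)$/.exec(reference);
    if (match === null) {
        throw new Error("Unsupported reference: " + reference);
    }
    // The number after "$" is the zero-based position in recordsObj.
    var record = recordsObj[parseInt(match[1], 10)];
    return record[match[2]];
}

// Two input records, as they might be passed to the constructor.
var recordsObj = [
    { PARAM1: "alpha", PARAM2: "beta" },
    { PARAM1: "gamma" }
];

var fromFirst = resolveReference("$0.PARAM1", recordsObj);
var fromSecond = resolveReference("$1.PARAM1", recordsObj);
```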

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
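The practical difference between the while pattern of the Loop component and the do… while pattern of the Repeat component is when the condition is first checked. The sketch below shows this with a stand-in object; alwaysFalse is hypothetical and only imitates the exec() call of a Condition object.

```javascript
// Stand-in for a Condition object whose exec() always returns false (illustration only).
var alwaysFalse = {
    exec: function (inputs) {
        return false;
    }
};

// Loop component pattern: the condition is checked before the first pass.
var whilePasses = 0;
while (alwaysFalse.exec([])) {
    whilePasses++;
}

// Repeat component pattern: the body runs once before the condition is checked.
var repeatPasses = 0;
do {
    repeatPasses++;
} while (alwaysFalse.exec([]));
```

With a condition that is false from the start, the while body never runs, whereas the do… while body runs exactly once.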

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used in the graphical generation interface, it becomes a JavaScript function that is invoked from its position within the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be sent to the page URL containing the POST parameters with which the page was originally reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, on which further data extraction operations are to be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL (see [NSEQL]) sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
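The order in which the synchronization options described for syncWithPost, setBackSequence and setBackPages are considered can be summarized as a small decision function. This is only a reading aid: chooseBackStrategy and its return labels are hypothetical and are not part of the Sequence API.

```javascript
// Hypothetical decision helper mirroring the fallback order described for
// syncWithPost, setBackSequence and the NSEQL Back() command.
function chooseBackStrategy(syncWithPostFlag, backSequence) {
    if (syncWithPostFlag) {
        return "POST";            // re-send the original POST parameters
    }
    if (backSequence) {
        return "BACK_SEQUENCE";   // run the explicit NSEQL back program
    }
    return "NSEQL_BACK";          // fall back to Back(), see setBackPages
}

var byDefault = chooseBackStrategy(true, null);
var withSequence = chooseBackStrategy(false, "<NSEQL back program>");
var withoutSequence = chooseBackStrategy(false, null);
```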

5.3.28 Store File

• Object: StoreFile
• Description: stores the content received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the content to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the content is to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to wait until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches an execution thread on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
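The way execute() forwards its argument list to the target function can be tried with a single-threaded stand-in. FakeThread below is hypothetical: unlike the Thread object, it takes a function reference rather than a function name, runs each call immediately, and does not model concurrency or setMaxConcurrentThreads.

```javascript
// Hypothetical, single-threaded stand-in for the Thread object: execute()
// runs the function immediately and wait() has nothing left to wait for.
function FakeThread() {}

FakeThread.prototype.execute = function (fn) {
    // Everything after the first argument is forwarded to the function.
    var args = Array.prototype.slice.call(arguments, 1);
    fn.apply(null, args);
};

FakeThread.prototype.wait = function () {};

var processed = [];

// The parameter list must match the arguments passed to execute().
function processRecord(structure, record) {
    processed.push(structure + ":" + record.ID);
}

var thread = new FakeThread();
var records = [{ ID: 1 }, { ID: 2 }, { ID: 3 }];
for (var i = 0; i < records.length; i++) {
    thread.execute(processRecord, "item", records[i]);
}
thread.wait();
```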

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }

  o This function defines the output structure that will be returned by the component. It is only necessary when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
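A minimal sketch of such a file, for a hypothetical component named “upper”, might look as follows. It assumes that the input arrives as a plain string and declares the STRING_TYPE code locally so that the fragment can be run outside ITPilot; the input-structure function is left out because this guide does not show its body, and no output-structure function is needed because the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Type code copied from the list above; declared locally so that the sketch
// runs outside ITPilot.
var STRING_TYPE = 13;

// Hypothetical component named "upper". Assumption: the input is a plain string.
function upper_main(upper_input) {
    var upper_output = null;
    upper_output = String(upper_input).toUpperCase();
    return upper_output;
}

// Declares that the component returns a string value.
function upper_getOutputType() {
    return STRING_TYPE;
}

var result = upper_main("itpilot");
```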

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple, as the VQL statement simply has to be written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl


5.3.10 Expression

• Object: Expression
• Description: allows an expression to be defined (based on constants and/or functions provided by ITPilot) that is evaluated to an output value.
• Functions:
  o Constructor(expression)
    • expression: object that defines the condition expression. This object is expressed as a string of characters. For example, MyCondition = new CONDITION(($0 <= $1)) indicates that, of the list of elements passed to the component in the exec method, the value of the first must be less than or equal to the value of the second. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
  o exec(exprInput): runs the component and returns the value resulting from the expression indicated in the component constructor.
    • exprInput: list of zero or more values, zero or more records, or zero or more record lists that are used as part of the expression.
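The positional reading of $0 and $1 in the example expression can be mimicked in plain JavaScript. lessOrEqual below is a hypothetical stand-in written for this sketch; it does not use or reproduce the ITPilot expression parser.

```javascript
// Hypothetical stand-in for the expression "($0 <= $1)": $0 and $1 are read
// as the first and second elements of the list passed to exec.
function lessOrEqual(exprInput) {
    return exprInput[0] <= exprInput[1];
}

var holds = lessOrEqual([3, 5]);
var fails = lessOrEqual([7, 5]);
```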

5.3.11 Extractor

• Object: Extractor
• Description: responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).
• Functions:
  o Constructor(name, page, specification, structure)
    • name: name of the Extractor component instance.
    • page: page-type ITPilot structure from which data is to be extracted.
    • specification: DEXTL data extraction specification (see [DEXTL]).
    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.
  o exec(): main extractor method; runs the specification indicated in the constructor. This function returns a list of records of the type defined by the structure parameter of the constructor.
  o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).
    • merge: Boolean parameter; “true” if the pattern-merging technique is to be applied, “false” if not. It is “true” by default.
  o setI18n(i18n): updates the internationalization configuration of the process.
    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB, and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch
• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.
• Functions:
  o Constructor(url, sequenceType, reusableConnection, binary, page)
    • url: URL where the resource to be downloaded can be found (OPTIONAL).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.
    • page: optionally, the page from which the HTTP request is launched.
  o exec(page): runs the component, returning the string- or binary-type value obtained.
    • page: optionally, the page from which the HTTP request is launched.
  o setEncoding(encoding): allows the user to determine the MIME type [MIME] of the information to send.
    • encoding: MIME type of the information to send.
  o syncWithPost(flag): lets the user set the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.
    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence defined with the setBackSequence function exists; if it does not, ITPilot executes a Back() NSEQL command.
  o setBackSequence(back): lets the user optionally set an explicit browsing sequence back to the source page, against which further information extraction operations are going to be executed.
    • back: back sequence NSEQL program.
  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

• reusingConnection: if set to "true", the connection coming from previous components is reused; if set to "false", a new browser is launched, importing information from the previous session.

o setBackPages(pages): sets the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when neither a back sequence nor POST synchronization has been configured.

o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation

• 1: Internet Explorer browser implementation

• 2: Firefox browser implementation

• 3: Denodo HTTP browser implementation
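
The order in which the page state is recovered, as described for syncWithPost, setBackSequence and setBackPages, can be sketched in plain JavaScript. This is only an illustration that runs outside ITPilot: chooseRecoveryStrategy, its settings object and the returned method names are hypothetical and not part of the ITPilot API, and falling back to one page when no page count is given is an assumption.

```javascript
// Hypothetical helper (not part of the ITPilot API): mirrors the documented
// fallback order POST synchronization -> back sequence -> Back() command.
function chooseRecoveryStrategy(settings) {
    if (settings.syncWithPost) {
        // Re-send the POST parameters originally used to reach the page.
        return { method: "POST" };
    }
    if (settings.backSequence) {
        // Run the explicit NSEQL back sequence given to setBackSequence.
        return { method: "BACK_SEQUENCE", sequence: settings.backSequence };
    }
    // Otherwise use the NSEQL Back() command; backPages stands for the value
    // given to setBackPages (defaulting to 1 here is an assumption).
    return { method: "BACK", pages: settings.backPages || 1 };
}

var strategy = chooseRecoveryStrategy({
    syncWithPost: false,
    backSequence: null,
    backPages: 2
});
console.log(strategy.method, strategy.pages);
```

With POST synchronization disabled and no back sequence defined, the sketch selects the Back() path and uses the configured number of pages.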

5.3.13 Filter

• Object: Filter

• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

o Constructor(expr, auxiliaryRecords)

• expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.

• auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subgroup that complies with the selection expression indicated in the constructor.

• inputRecords: list of input records.

• auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Filter returns the list of filtered elements except for the one that caused the error.
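
The effect of ON_ERROR_IGNORE described in the note can be sketched with a stand-in written in plain JavaScript. The function filterRecords, its ignoreErrors flag and the sample PRICE records are hypothetical; the real Filter component evaluates an ITPilot expression rather than a JavaScript predicate.

```javascript
// Stand-in for the Filter component: keeps the records that satisfy
// `predicate`. With ignoreErrors set (the ON_ERROR_IGNORE case), a record
// whose evaluation throws is dropped instead of aborting the whole filter.
function filterRecords(records, predicate, ignoreErrors) {
    var kept = [];
    for (var i = 0; i < records.length; i++) {
        try {
            if (predicate(records[i])) {
                kept.push(records[i]);
            }
        } catch (e) {
            if (!ignoreErrors) {
                throw e; // the error is raised and the filter stops
            }
        }
    }
    return kept;
}

// Sample data: the second record makes the condition fail with an error.
var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }];
function cheaperThan20(record) {
    if (record.PRICE === null) {
        throw new Error("PRICE is missing");
    }
    return record.PRICE < 20;
}

var filtered = filterRecords(records, cheaperThan20, true);
console.log(JSON.stringify(filtered));
```

Calling filterRecords with ignoreErrors set to false on the same data propagates the error instead of returning a partial list.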

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: generates a run loop for a specific form, in which predetermined values are used for each of the included fields in each run.

• Functions:

o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

• findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

• submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

• baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences of this component.

• inputPage: input page from which the selected form can be iteratively invoked.

• parallelIterator: if "true", the component executes its iterations in parallel.

o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting at 0.

• positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

• clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting at 0.

• valuesArray: list of values that must be selected in the field.

• positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

• equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).

• clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.

• field: name of the HTML selection field.

• position: position of the field when more than one field has the same name.

• positions: positions of the elements on which the component must iterate.

o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.

• field: name of the HTML text field.

• position: position of the field when several fields on the form have the same name.

• values: list of values that must be selected in the field.

• positions: list that indicates the position held for each element of values in the event of replicated values.

• equal: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely contain them.

o click(field, value, state): selects an element and runs a "click" event on it.

• field: name of the HTML field on which the click is to be made.

• value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.

• state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o input(field, position, values): indicates the values entered in an input field.

• field: name of the HTML input field.

• position: position of the field when several fields on the form have the same name.

• values: list of values to be entered in the field.

o textarea(field, position, values): indicates the values entered in a text area.

• field: name of the HTML text area field.

• position: position of the field when several fields on the form have the same name.

• values: list of values to be entered in the field.

o toList(): returns the list of NSEQL sequences used in each iteration.

o setMaxIterations(count): sets the maximum number of iterations that can be executed.

• count: maximum number of iterations.

o setRetries(count): sets the number of retries in the event of failure.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.

o setParallelIterator(flag): makes the component launch its iterations in parallel.

• flag: if "true", the iterations are executed in parallel.

o next(inputPage): returns the page resulting from running one iteration of the component.

• inputPage: optional parameter that indicates a new starting page on which the next iteration is run.

o hasNext(): determines whether there are more results. It returns "true" if there is at least one more result, or "false" if there is not.

o close(): closes the iterator.

o syncWithPost(flag): indicates whether, to recover the page state, a POST message must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization method.

• flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

• reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.

o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, that is, when neither an explicit back sequence nor POST synchronization has been configured.

o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation

• 1: Internet Explorer browser implementation

• 2: Firefox browser implementation

• 3: Denodo HTTP browser implementation
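
The way a clickedArray can expand into several form submissions can be sketched in plain JavaScript. This is a hedged illustration: the constant values and the function expandClickedStates are hypothetical, and the sketch assumes that combinations multiply across elements (a cross product), which the text above states only for a single element.

```javascript
// Hypothetical stand-ins for the ITPilot constants; the real values are
// not given in this guide.
var CLICKED_ELEMENT = "clicked";
var NON_CLICKED_ELEMENT = "non-clicked";
var CLICKED_AND_NON_CLICKED_ELEMENT = "both";

// Expands a clickedArray into every mark/unmark combination. Each entry of
// a combination is true (element marked) or false (element unmarked).
function expandClickedStates(clickedArray) {
    var combos = [[]];
    for (var i = 0; i < clickedArray.length; i++) {
        var options = clickedArray[i] === CLICKED_AND_NON_CLICKED_ELEMENT
            ? [true, false]
            : [clickedArray[i] === CLICKED_ELEMENT];
        var next = [];
        for (var c = 0; c < combos.length; c++) {
            for (var o = 0; o < options.length; o++) {
                next.push(combos[c].concat([options[o]]));
            }
        }
        combos = next;
    }
    return combos;
}

var combos = expandClickedStates([
    CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT,
    NON_CLICKED_ELEMENT
]);
console.log(JSON.stringify(combos));
```

Under this assumption, one element flagged with CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of combinations, so the three-element example yields two of them.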

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.

• Functions:

o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.

• browserUuid: browser id.

o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, for later browsing by means of a Sequence object (see section 5.3.27).

• pageType: type of browser used to access the page:

• SEQUENCE_IEBROWSER = 1

• SEQUENCE_HTTP_BROWSER = 2

• lastURL: last URL the page is coming from.

• lastURLMethod: access method (GET, POST) of the URL the page is coming from.

• lastURLPostParameters: POST-method parameters of the URL the page is coming from.

• cookie: information stored in "cookies".

• proxyUser: user name to access the proxy, if required.

• proxyPassword: user password to access the proxy, if required.

• proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: stores the structure of the input data, that is, the data that the wrapper will receive from the calling application.

• Functions:

o Constructor(input, output)

• input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

• output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).

o get(name): returns the value of a field of the record created as the group of initialization parameters.

• name: name of the record field.

o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

• field: name of the field to create.

• format: representation format of the date field. The format is optional but, once specified, compliance with it is compulsory; otherwise, the wrapper may not run. The representation format is defined in [DATEFORMAT].

• obl: indicates whether the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; a value does not need to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setName(name): updates the component name.

• name: new component name.

o setI18n(i18n): updates the internationalization configuration of the process.

• i18n: type of internationalization to be used. ITPilot provides different internationalization configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
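
The three values of the obl parameter can be illustrated with a small stand-in written in plain JavaScript. The function buildInitRecord, the constant values and the field names KEYWORD, MAX_PRICE and COUNTRY are hypothetical. The sketch also assumes that a FIXED field ignores any value supplied by the caller and that a missing optional field is left as null; neither detail is stated in this guide.

```javascript
// Hypothetical stand-ins for the obl constants; not the ITPilot values.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// fields: array of { name, obl, fixedValue }; query: values from the caller.
// Returns the record of initialization parameters for that query.
function buildInitRecord(fields, query) {
    var record = {};
    for (var i = 0; i < fields.length; i++) {
        var f = fields[i];
        var supplied = Object.prototype.hasOwnProperty.call(query, f.name);
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue; // constant value (assumption: caller value ignored)
        } else if (supplied) {
            record[f.name] = query[f.name];
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null; // optional and not supplied (assumption)
        }
    }
    return record;
}

var fields = [
    { name: "KEYWORD", obl: OBLIGATORY },
    { name: "MAX_PRICE", obl: OPTIONAL },
    { name: "COUNTRY", obl: FIXED, fixedValue: "ES" }
];
var initRecord = buildInitRecord(fields, { KEYWORD: "laptop" });
console.log(JSON.stringify(initRecord));
```

A query that omits KEYWORD is rejected by the sketch, which is the behavior the OBLIGATORY value is meant to convey.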

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records, one by one.

• Functions:

o Constructor(list)

• list: list of records on which to iterate.

o hasNext(): determines whether there are more results on which to iterate. It returns "true" if there is at least one more result.

o next(): returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
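
The hasNext()/next() protocol can also be tried outside ITPilot with a minimal stand-in. ListIterator and the TITLE field are hypothetical names used only for this sketch; inside a wrapper the Iterator object provided by ITPilot plays this role, and the loop runs sequentially here rather than through a Thread object.

```javascript
// Minimal stand-in for the Iterator component, only to show the
// hasNext()/next() protocol over a sorted list of records.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
ListIterator.prototype.next = function () {
    return this.list[this.position++];
};

var iterator = new ListIterator([{ TITLE: "a" }, { TITLE: "b" }, { TITLE: "c" }]);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE); // per-record processing goes here
}
console.log(titles.join(","));
```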

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.

• Functions:

o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

• uuid: unique identifier of the component.

• uri: connection URL of the database.

• driver: driver class used to connect to the data source.

• userName: user name.

• password: user password.

• structure: structure of the component's output record list. It is defined as a record of values.

• baseRecords: record list to be used.

• maxPoolSize: maximum number of connections that the pool can manage at the same time.

• initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

• checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

• query: SQL query that returns the results required by the component.

o exec(query, baseRecords): executes the JDBCExtractor component.

• query: SQL query that returns the results required by the component.

• baseRecords: record list to be used.

o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

• maxPoolSize: maximum number of connections that the pool can manage at the same time.

• initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

• pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

o disablePool(): disables the connection pool.

o addDriverProperty(propname, propvalue): adds a JDBC driver property.

• propname: property name.

• propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE … DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: allows iteration over different inter-related pages by means of one or several browsing sequences.

• Functions:

o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

• sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.

• iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session's information.

• inputPage: page from which the next browsing sequence is to be run.

o next(inputRecords, inputPage): returns the next iteration element.

• inputRecords: list of input records that can be used as parameters within the browsing sequences of the next interval.

• inputPage: page from which the next pages are to be accessed.

o close(): closes the iterator.

o setRetries(count): sets the number of retries in the event of an error when accessing the next page.

• count: number of retries.

o setRetryDelay(count): sets the interval between two retries.

• count: interval in milliseconds.

o syncWithPost(flag): indicates whether, to recover the page state, a POST message must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization method.

• flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

• reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.

o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, that is, when neither an explicit back sequence nor POST synchronization has been configured.

o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation

• 1: Internet Explorer browser implementation

• 2: Firefox browser implementation

• 3: HTTP browser implementation
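
The interplay between setRetries and setRetryDelay can be pictured with a generic retry loop in plain JavaScript. The helper runWithRetries and its injected wait function are hypothetical and are not how ITPilot is implemented; the sketch also assumes that the retry count is the number of additional attempts made after the first failure.

```javascript
// Hypothetical helper: runs `action` and, on failure, retries it up to
// `retries` more times, calling wait(retryDelayMs) before each retry.
// `wait` is injected so that the sketch stays synchronous.
function runWithRetries(action, retries, retryDelayMs, wait) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (e) {
            if (attempt >= retries) {
                throw e; // retries exhausted, propagate the last error
            }
            attempt++;
            wait(retryDelayMs);
        }
    }
}

var waits = [];
var page = runWithRetries(
    function (attempt) {
        // Simulated page access that fails on the first two attempts.
        if (attempt < 2) {
            throw new Error("page not available");
        }
        return "page loaded on attempt " + attempt;
    },
    3,   // comparable to setRetries(3)
    500, // comparable to setRetryDelay(500)
    function (ms) { waits.push(ms); }
);
console.log(page, waits.length);
```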

5.3.21 Output

• Object: Output

• Description: places a record in the wrapper output.

• Functions:

o Constructor(structure)

• structure: component input record to be used as the wrapper result.

o add(record): adds, after construction, the component input record to be used as the wrapper result.

• record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.

• Functions:

o Constructor(recordsObj, name)

• recordsObj: list of input elements. Each element of the list can be a record or a list of records.

• name: name of the output record of the Record Constructor component.

o add(fieldName, expression, errorAction): adds a new field to the record under construction.

• fieldName: name of the field.

• expression: expression that defines the field; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

• errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:

• ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

• ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.
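
How a field reference selects a value from the recordsObj list can be sketched in plain JavaScript. The sketch assumes that references are written as "$0.PARAM1" (a zero-based index into recordsObj, a dot, and a field name); the function resolveFieldReference and the field names CATEGORY, COUNT and TOTAL are hypothetical, and real expressions can also use the functions of Appendix A of [GENER], which this sketch does not cover.

```javascript
// Hypothetical resolver for references of the form "$<index>.<FIELD>",
// where <index> is a zero-based position in the recordsObj list.
function resolveFieldReference(expression, recordsObj) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = recordsObj[Number(match[1])];
    if (record === undefined || !(match[2] in record)) {
        throw new Error("Cannot resolve: " + expression);
    }
    return record[match[2]];
}

var recordsObj = [{ PARAM1: "books" }, { TOTAL: 42 }];
var outputRecord = {
    CATEGORY: resolveFieldReference("$0.PARAM1", recordsObj), // first input record
    COUNT: resolveFieldReference("$1.TOTAL", recordsObj)      // second input record
};
console.log(JSON.stringify(outputRecord));
```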

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: creates a browsing sequence from the results of a record. It allows sequences that access other pages to be created from pages processed by the Extractor component.

• Functions:

o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

• sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.

• sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session's information. In general this value will be "true", although it may not be a good option in some cases if the previous iterator is run in parallel.

• inputPage: optional; allows a starting page to be indicated.

o exec(): returns a page object that represents the target page of the browsing sequences.

o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.

• Functions:

o Constructor(page)

• page: page loaded on the browser that is going to be released.

o Constructor(browserUuid)

• browserUuid: browser identifier.

o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT … UNTIL). The Repeat component is implemented in JavaScript using a do … while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
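
The practical difference between the loop shapes of Figure 6 and Figure 7 is when the condition is evaluated. The following sketch shows this in plain JavaScript with a stand-in for the Condition object; CountdownCondition, runWhile and runDoWhile are hypothetical names, and the real Condition evaluates an ITPilot expression instead of a counter.

```javascript
// Stand-in for a Condition object: exec() returns true while the
// `remaining` counter is above zero.
function CountdownCondition(remaining) {
    this.remaining = remaining;
}
CountdownCondition.prototype.exec = function () {
    return this.remaining > 0;
};

// Shape of Figure 6: the condition is checked before each run.
function runWhile(condition) {
    var runs = 0;
    while (condition.exec([])) {
        condition.remaining--; // loop operations
        runs++;
    }
    return runs;
}

// Shape of Figure 7: the body runs once before the first check.
function runDoWhile(condition) {
    var runs = 0;
    do {
        condition.remaining--; // loop operations
        runs++;
    } while (condition.exec([]));
    return runs;
}

console.log(runWhile(new CountdownCondition(0)));   // body never runs
console.log(runDoWhile(new CountdownCondition(0))); // body runs once
console.log(runWhile(new CountdownCondition(3)), runDoWhile(new CountdownCondition(3)));
```

When the condition is already false on entry, the while form skips the body entirely, whereas the do … while form still executes it once.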

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the generation graphical interface, it becomes a JavaScript function that is invoked from its position within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

o Constructor(sequence, sequenceType, reusableConnection, inputPage)

• sequence: NSEQL browsing program (see [NSEQL]).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

• inputPage: optional parameter that indicates the starting page. If it is not specified, the NSEQL program is run directly.

o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.

• inputValues: list of values that can be used as input parameters within the browsing sequence.

• inputPage: optional parameter that indicates the page from which the component browsing sequence is run.

o setRetries(count): sets the number of retries in the event of failure.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.

o close(): closes the connection with the running browser.

o syncWithPost(flag): indicates whether, to recover the page state, a POST message must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization method.

• flag: "true" indicates that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and POST navigation has not been configured as the back sequence.

    • pages: number of back pages.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
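The retry settings of the Sequence component (setRetries and setRetryDelay) describe a common retry-with-delay pattern. The sketch below illustrates that pattern in plain JavaScript; it is not the ITPilot implementation. The names runWithRetries, action and sleep are hypothetical, and the sketch assumes that the retry count means additional attempts after the first failure. The waiting function is passed in so the example can record the waits instead of blocking.

```javascript
// Sketch of retry-with-delay semantics; not the ITPilot implementation.
// runWithRetries, action and sleep are hypothetical names.
// Assumption: "retries" counts the extra attempts made after the first failure.
function runWithRetries(action, retries, delayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (err) {
            if (attempt >= retries) {
                throw err;      // retries exhausted: propagate the failure
            }
            sleep(delayMs);     // wait before the next attempt
            attempt++;
        }
    }
}

// Usage: an action that fails twice and then succeeds.
var waits = [];
var result = runWithRetries(
    function (attempt) {
        if (attempt < 2) { throw new Error("temporary failure"); }
        return "page loaded on attempt " + (attempt + 1);
    },
    3,                                  // comparable to setRetries(3)
    500,                                // comparable to setRetryDelay(500)
    function (ms) { waits.push(ms); }   // records the wait instead of blocking
);
console.log(result, waits);
```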

5.3.28 Store File

• Object: StoreFile

• Description: stores the contents passed as the input parameter in a file.

• Functions:

  o Constructor(content, file)

    • content: string-type or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.

    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): sets the number of retries in the event of failure.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread

• Description: represents a thread in the ITPilot wrapper. It is typically used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): causes the thread to wait until all executions invoked with the execute function have finished.

  o execute(functionName, <list of arguments>): launches a thread that runs the specified function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.

    • int: maximum number.
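The queueing behaviour described for setMaxConcurrentThreads (at most N executions in progress, later requests waiting for a free slot) can be modelled with a small bookkeeping object. This is only a sketch of the idea, not ITPilot's Thread component: BoundedRunner and its methods are hypothetical, and completion is signalled manually so the example stays deterministic.

```javascript
// Sketch of "at most N running, the rest queued"; not ITPilot's Thread.
// BoundedRunner and its methods are hypothetical names.
function BoundedRunner(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = [];   // names of tasks currently executing
    this.queued = [];    // names of tasks waiting for a free slot
}

// Start the task if a slot is free, otherwise queue it.
BoundedRunner.prototype.execute = function (name) {
    if (this.running.length < this.maxConcurrent) {
        this.running.push(name);
    } else {
        this.queued.push(name);
    }
};

// Mark a running task as finished and promote the oldest queued task.
BoundedRunner.prototype.finish = function (name) {
    var index = this.running.indexOf(name);
    if (index === -1) { return; }
    this.running.splice(index, 1);
    if (this.queued.length > 0) {
        this.running.push(this.queued.shift());
    }
};

// Usage: a limit of 2 and three requests.
var runner = new BoundedRunner(2);
runner.execute("record1");
runner.execute("record2");
runner.execute("record3");            // queued: both slots are busy
var queuedBefore = runner.queued.slice();
runner.finish("record1");             // frees a slot, record3 starts
console.log(queuedBefore, runner.running, runner.queued);
```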

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:

    • LIST_TYPE = 1

    • PAGE_TYPE = 2

    • RECORD_TYPE = 3

    • SIMPLE_TYPE = 4

    • ARRAY_TYPE = 5

    • BINARY_TYPE = 6

    • BOOLEAN_TYPE = 7

    • DATE_TYPE = 8

    • DOUBLE_TYPE = 9

    • FLOAT_TYPE = 10

    • INT_TYPE = 11

    • LONG_TYPE = 12

    • STRING_TYPE = 13

    • URL_TYPE = 14

    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
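As an illustration of the naming scheme above, the following sketch shows what a very small component file might look like. It is a hedged example rather than a reference implementation: the component name “mycustom”, the upper-casing logic and the treatment of the input as a plain string are all assumptions made for the sketch. STRING_TYPE is declared locally, with the value listed above, only so that the code runs outside ITPilot. The input and output structure functions are left out because their bodies are not shown in this guide.

```javascript
// Skeleton of a custom component file following the function names
// described above. "mycustom" and the upper-casing logic are illustrative.
// STRING_TYPE is declared here only so the sketch runs outside ITPilot
// (value taken from the list of output types).
var STRING_TYPE = 13;

// Main function: receives the component input and returns its output.
// Assumption: the input is treated as a plain string.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    if (mycustom_input !== null && mycustom_input !== undefined) {
        mycustom_output = String(mycustom_input).toUpperCase();
    }
    return mycustom_output;
}

// Declares the output type of the component.
function mycustom_getOutputType() {
    return STRING_TYPE;
}

console.log(mycustom_main("denodo"), mycustom_getOutputType());
```

Because the declared type is a simple one, no output structure function is needed in this sketch; it would be required for RECORD_TYPE or LIST_TYPE outputs.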

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8. Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> is the name of the component.

• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is simple: the following VQL statement has to be written:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘\’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

5.3.11 Extractor

• Object: Extractor

• Description: this component is responsible for extracting structured data from an HTML page, thus generating a DEXTL program ([DEXTL]).

• Functions:

  o Constructor(name, page, specification, structure)

    • name: name of the Extractor component instance.

    • page: page-type ITPilot structure from which data is to be extracted.

    • specification: DEXTL data extraction specification (see [DEXTL]).

    • structure: name of the (previously created) record that will be used to return the data extracted by the specification.

  o exec(): main extractor method, which runs the specification indicated in the constructor. It returns a list of records of the type defined by the structure parameter of the constructor.

  o setMergePatterns(merge): applies the pattern-merging technique for greater system optimization (see [GENER] for further information).

    • merge: Boolean parameter. “true” if the pattern-merging technique is to be applied, “false” if not. It is “true” by default.

  o setI18n(i18n): sets the internationalization used by the process.

    • i18n: type of internationalization to use. ITPilot provides different internationalization options, such as ES_EURO, US_PST, GB and so on. See [GENER] for more information about internationalization in ITPilot.

5.3.12 Fetch

• Object: Fetch

• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

  o Constructor(url, sequenceType, reusableConnection, binary, page)

    • url: URL where the resource to be downloaded can be found (OPTIONAL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • binary: “true” if the object to be downloaded is binary; “false” if it is in text format.

    • page: optionally, the page from which the HTTP request is launched.

  o exec(page): runs the component and returns the string-type or binary-type value obtained.

    • page: optionally, the page from which the HTTP request is launched.

  o setEncoding(encoding): allows the user to set the MIME type [MIME] of the information to send.

    • encoding: MIME type of the information to send.

  o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.

    • flag: “true” means that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes an NSEQL Back() command.

  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, against which further data extraction operations are going to be executed.

    • back: NSEQL back sequence program.

  o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

    • reusingConnection: if set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing the information from the previous session.

  o setBackPages(pages): sets the number of pages ITPilot goes back when an NSEQL Back() command is executed because no back sequence has been defined and POST navigation has not been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation

5.3.13 Filter

• Object: Filter

• Description: filters a list of records, returning those that meet a given condition.

• Functions:

  o Constructor(expr, auxiliaryRecords)

    • expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.

    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that satisfies the selection expression indicated in the constructor.

    • inputRecords: list of input records.

    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Filter returns the list of filtered elements except for the one that caused the error.
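The behaviour described in the note (a record that causes an error is left out while the remaining records are still filtered) can be sketched in plain JavaScript as follows. This is not ITPilot code: filterIgnoringErrors, the record fields and the condition are hypothetical and serve only to illustrate the idea.

```javascript
// Sketch of filtering where records that raise an error are skipped,
// mirroring the ON_ERROR_IGNORE note. Not ITPilot code:
// filterIgnoringErrors and the record fields are hypothetical.
function filterIgnoringErrors(inputRecords, condition) {
    var kept = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (condition(inputRecords[i])) {
                kept.push(inputRecords[i]);
            }
        } catch (err) {
            // The record that caused the error is left out;
            // the remaining records are still evaluated.
        }
    }
    return kept;
}

var records = [
    { title: "A", price: 10 },
    { title: "B", price: null },   // makes the condition fail with an error
    { title: "C", price: 30 },
    { title: "D", price: 5 }
];
var cheapTitles = filterIgnoringErrors(records, function (r) {
    if (r.price === null) { throw new Error("missing price"); }
    return r.price <= 10;
}).map(function (r) { return r.title; });
console.log(cheapTitles);
```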

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: allows a loop to be run over a specific form, using predetermined values for each of its fields in each run.

• Functions:

  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

    • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences used in this component.

    • inputPage: input page from which the selected form is invoked iteratively.

    • parallelIterator: if “true”, the component executes its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

    • clickedArray: list that indicates whether each valuesArray element is to be marked, left unmarked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field of the chosen form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • valuesArray: list of values that must be selected in the field.

    • positionsArray: list that indicates the position held for each valuesArray element in the event of replicated values.

    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).

    • clickedArray: list that indicates whether each valuesArray element is to be marked, left unmarked, or both. The following JavaScript constants are defined for this:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): indicates the values selected from a selection field of the chosen form.

    • field: name of the HTML selection field.

    • position: position of the field when more than one field has the same name.

    • positions: values of the elements over which the component must iterate.

  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.

    • field: name of the HTML text field.

    • position: position of the field when several fields on the form have the same name.

    • values: list of values that must be selected in the field.

    • positions: list that indicates the position held for each value element in the event of replicated values.

    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may merely be contained in them.

  o click(field, value, state): selects an element and runs a “click” event on it.

    • field: name of the HTML field on which the click is to be made.

    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When it is run on checkboxes, it indicates the value of the selectable element.

    • state: when this function is run on radio buttons, this parameter is not used. When it is run on checkboxes, it indicates the status of the element:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o input(field, position, values): indicates the values added to an input field.

    • field: name of the HTML input field.

    • position: position of the field when several fields on the form have the same name.

    • values: list of values that must be selected in the field.

  o textarea(field, position, values): indicates the values added to a text area.

    • field: name of the HTML input field.

    • position: position of the field when several fields on the form have the same name.

    • values: list of values that must be selected in the field.

  o toList(): returns the list of NSEQL sequences used in each iteration.

  o setMaxIterations(count): sets the maximum number of iterations that can be executed.

    • count: maximum number of iterations.

  o setRetries(count): sets the number of retries in the event of failure.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o setParallelIterator(flag): makes the component launch its iterations in parallel.

    • flag: if “true”, the iterations are executed in parallel.

  o next(inputPage): returns the page resulting from running one iteration of the component.

    • inputPage: optional parameter that allows a new starting page to be indicated, on which the new component iteration is run.

  o hasNext(): determines whether there are more results. It returns “true” if there is at least one more result, or “false” if there is not.

  o close(): closes the iterator.

  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be sent to the page URL with the POST parameters with which the page was originally reached. This is the default synchronization method.

    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and POST navigation has not been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
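The hasNext()/next() iteration style of Form_Iterator, together with the way CLICKED_AND_NON_CLICKED_ELEMENT yields two combinations, can be pictured with a small stand-alone iterator. The sketch below is not ITPilot code and makes several assumptions: CombinationIterator and expandState are hypothetical, the three constants are local stand-ins for the names used above, and the combinations are assumed to be the Cartesian product of the values given for each field.

```javascript
// Sketch of iterating over every combination of form field values.
// Not ITPilot code: CombinationIterator and expandState are hypothetical,
// and the constants are local stand-ins for the names used in the text.
var CLICKED_ELEMENT = "clicked";
var NON_CLICKED_ELEMENT = "non-clicked";
var CLICKED_AND_NON_CLICKED_ELEMENT = "both";

// Expand a checkbox state into the concrete states to try.
function expandState(state) {
    return state === CLICKED_AND_NON_CLICKED_ELEMENT
        ? [CLICKED_ELEMENT, NON_CLICKED_ELEMENT]
        : [state];
}

// fields: array of { name, values }. Assumption: the iterations are the
// Cartesian product of the values of all fields.
function CombinationIterator(fields) {
    this.fields = fields;
    this.total = fields.reduce(function (n, f) { return n * f.values.length; }, 1);
    this.index = 0;
}

CombinationIterator.prototype.hasNext = function () {
    return this.index < this.total;
};

// Decode the current index into one value per field (last field varies fastest).
CombinationIterator.prototype.next = function () {
    var remainder = this.index++;
    var combination = {};
    for (var i = this.fields.length - 1; i >= 0; i--) {
        var values = this.fields[i].values;
        combination[this.fields[i].name] = values[remainder % values.length];
        remainder = Math.floor(remainder / values.length);
    }
    return combination;
};

var iterator = new CombinationIterator([
    { name: "city", values: ["Madrid", "Paris"] },
    { name: "onlyOffers", values: expandState(CLICKED_AND_NON_CLICKED_ELEMENT) }
]);
var combinations = [];
while (iterator.hasNext()) {
    combinations.push(iterator.next());
}
console.log(combinations.length);
```

In a wrapper, each iteration would submit the form and return a page rather than a plain object; the loop shape (test hasNext(), then call next()) is the part this sketch is meant to show.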

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.

• Functions:

  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.

    • browserUuid: browser id.

  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing with a Sequence object (see section 5.3.27).

    • pageType: type of browser used to access the page:

      • SEQUENCE_IEBROWSER = 1

      • SEQUENCE_HTTP_BROWSER = 2

    • lastURL: last URL from which the page comes.

    • lastURLMethod: access method (GET, POST) of the URL from which the page comes.

    • lastURLPostParameters: POST-method parameters of the URL from which the page comes.

    • cookie: stored “cookies” information.

    • proxyUser: user name to access the proxy, if required.

    • proxyPassword: user password to access the proxy, if required.

    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.

• Functions:

  o Constructor(input, output)

    • input: input record of the component. It is optional and is used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).

  o get(name): returns the value of a field of the record created as the group of initialization parameters.

    • name: name of the record field.

  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

o setLong(field obl fixedValue) this creates a long-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setFloat(field obl fixedValue) this creates a floating-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setDouble(field obl fixedValue) this creates a double-type field in the initialization record

bull field name of the field to create

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

• field: name of the field to create.

• obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

• field: name of the field to create.

• format: representation format of the date field. This parameter is optional, but once it is specified, date values must comply with it; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].

• obl: parameter that indicates whether a value for the field is compulsory in queries. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.

• fixedValue: optional parameter that indicates a constant value assigned to the field.

o setName(name): updates the component name.

• name: new component name.

o setI18n(i18n): updates the i18n configuration of the process.

• i18n: type of internationalization to be used. ITPilot provides different types of i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

o exec(): main function for running the component; it returns a record representing the wrapper initialization parameters.
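The three values of obl can be illustrated with a small stand-alone sketch. It does not use the ITPilot Init object: buildInitRecord, initFields and the sample field names are hypothetical, and the idea that a FIXED field ignores any value supplied in the query is an assumption of this sketch rather than documented behavior.

```javascript
// Hypothetical stand-in for illustration only; this is not the ITPilot Init object.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// Each entry mirrors the (field, obl, fixedValue) arguments of the set* functions.
const initFields = [
  { field: "TITLE", obl: OPTIONAL },
  { field: "CUSTOMER_ID", obl: OBLIGATORY },
  { field: "COUNTRY", obl: FIXED, fixedValue: "ES" }
];

// Builds the initialization record for one query, applying the three modes.
function buildInitRecord(fields, queryValues) {
  const record = {};
  for (const { field, obl, fixedValue } of fields) {
    if (obl === FIXED) {
      // Assumption: the constant always wins over a value sent in the query.
      record[field] = fixedValue;
    } else if (field in queryValues) {
      record[field] = queryValues[field];
    } else if (obl === OBLIGATORY) {
      throw new Error("Missing obligatory parameter: " + field);
    } else {
      // Optional parameter that was not supplied in this query.
      record[field] = null;
    }
  }
  return record;
}
```

In this sketch a query that omits CUSTOMER_ID is rejected, while one that omits TITLE is accepted with an empty value.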

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates on a list of records, one by one.

• Functions:

o Constructor(list)

• list: list of records on which to iterate.

o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.

o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphic interface is translated into the following JavaScript structure, using the Thread object described in section 5.3.29:

var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}

Figure 5: Using threads in the Iterator component
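The hasNext()/next() contract can be tried outside ITPilot with a minimal stand-in. ListIterator and collectTitles below are hypothetical names, and records are represented as plain JavaScript objects, which is an assumption made only for this sketch.

```javascript
// Hypothetical stand-in that mimics the hasNext()/next() contract described above.
class ListIterator {
  constructor(list) {
    this.list = list;
    this.position = 0;
  }
  // True while at least one more element is left to visit.
  hasNext() {
    return this.position < this.list.length;
  }
  // Returns the current element and advances, preserving list order.
  next() {
    return this.list[this.position++];
  }
}

// Sequential traversal: the same loop shape as Figure 5, without threads.
function collectTitles(records) {
  const iterator = new ListIterator(records);
  const titles = [];
  while (iterator.hasNext()) {
    const recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);
  }
  return titles;
}
```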

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: these functions allow a query to be sent to any source available via JDBC and return a record list with the results obtained.

• Functions:

o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

• uuid: unique identifier of the component.

• uri: connection URL to the database.

• driver: driver class to use to connect to the data source.

• userName: user name.

• password: user password.

• structure: structure of the component’s output record list. It is defined as a record of values.

• baseRecords: record list to be used.

• maxPoolSize: maximum number of connections that can be managed by the pool at the same time.

• initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.

• checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

• query: SQL query that returns the results required by the component.

o exec(query, baseRecords): executes the JDBCExtractor component.

• query: SQL query that returns the results required by the component.

• baseRecords: record list to be used.

o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

• maxPoolSize: maximum number of connections that can be managed by the pool at the same time.

• initialPoolSize: initial number of connections in the pool. This number of idle connections is established up front, ready to be used.

• pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

o disablePool(): disables the connection pool.

o addDriverProperty(propname, propvalue): adds a JDBC driver property.

• propname: property name.

• propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}

Figure 6: Using the Loop function
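The following sketch shows the control flow of the Loop component in plain JavaScript: the condition is evaluated before every pass, so the body may run zero times. makeLoopCondition is a hypothetical stand-in for the Condition object (it only offers an exec() that returns a boolean), and runLoop is an invented helper.

```javascript
// Hypothetical stand-in for a Condition object: exec() returns a boolean.
function makeLoopCondition(predicate) {
  return {
    exec: function () {
      return predicate();
    }
  };
}

// Counts how many passes a WHILE... DO style loop performs.
function runLoop(maxPages) {
  let page = 0;
  const loop = makeLoopCondition(function () {
    return page < maxPages;
  });
  while (loop.exec([])) {
    // <loop operations> would go here; the sketch only counts passes.
    page++;
  }
  return page;
}
```

With maxPages set to 0 the body never runs, because the condition is checked first.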

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: allows iterating over different inter-related pages by means of one or several browsing sequences.

• Functions:

o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

• sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.

• iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter.

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.

• inputPage: indicates the page from which the next browsing sequence is to be run.

o next(inputRecords, inputPage): returns the next iteration element.

• inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.

• inputPage: indicates the page from which the next pages are to be accessed.

o close(): closes the iterator.

o setRetries(count): configures the number of retries in the event of an error when accessing the next page.

• count: number of retries.

o setRetryDelay(count): configures the interval between two retries.

• count: interval in milliseconds.

o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.

• flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

• reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, that is, when no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation

• 1: Internet Explorer browser implementation

• 2: Firefox browser implementation

• 3: HTTP browser implementation

5.3.21 Output

• Object: Output

• Description: places a record in the wrapper output.

• Functions:

o Constructor(structure)

• structure: parameter that indicates the component input record to be used as the wrapper result.

o add(record): adds the component input record that is to be used as the wrapper result.

• record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from already existing ones.

• Functions:

o Constructor(recordsObj, name)

• recordsObj: list of input elements. Each element of the list can be a record or a list of records.

• name: name of the output record of the Record Constructor component.

o add(fieldName, expression, errorAction): adds a new field to the record under construction.

• fieldName: name of the field.

• expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

• errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:

• ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

• ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
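As an illustration of how a field reference such as PARAM1 of the first input record could be resolved, the sketch below parses references of the form $<index>.<FIELD>. This is not the ITPilot expression evaluator: resolveReference is a hypothetical helper, the dot separator is an assumption, records are plain objects, and none of the functions of Appendix A of [GENER] are covered.

```javascript
// Hypothetical helper: resolves "$<index>.<FIELD>" against a list of input records.
function resolveReference(expression, recordsObj) {
  const match = /^\$(\d+)\.(\w+)$/.exec(expression);
  if (match === null) {
    throw new Error("Unsupported expression: " + expression);
  }
  const index = Number(match[1]);
  if (index >= recordsObj.length) {
    throw new Error("No input record at position " + index);
  }
  // $0 refers to the first input record, $1 to the second, and so on.
  return recordsObj[index][match[2]];
}
```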

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.

• Functions:

o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

• sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.

• sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.

• inputPage: optional; allows a starting page to be indicated.

o exec(): returns a page object that represents the target page of the browsing sequences.

o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as browser identifier and releases that specific browser.

• Functions:

o Constructor(page)

• page: page loaded on the browser that is going to be released.

o Constructor(browserUuid)

• browserUuid: browser identifier.

o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));

Figure 7: Using the Repeat function
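Because the Repeat component relies on a do… while loop, its body runs at least once, even if the condition is already false on the first check. The sketch below shows this in plain JavaScript; makeRepeatCondition is a hypothetical stand-in for the Condition object and runRepeat is an invented helper.

```javascript
// Hypothetical stand-in for a Condition object: exec() returns a boolean.
function makeRepeatCondition(predicate) {
  return {
    exec: function () {
      return predicate();
    }
  };
}

// Counts how many passes a do... while loop performs.
function runRepeat(maxPasses) {
  let passes = 0;
  const repeat = makeRepeatCondition(function () {
    return passes < maxPasses;
  });
  do {
    // <loop_operations> would go here; the sketch only counts passes.
    passes++;
  } while (repeat.exec([]));
  return passes;
}
```

Calling runRepeat(0) still performs one pass, which is the practical difference from a while loop whose condition is checked before the first pass.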

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the generation graphic interface, it becomes a JavaScript function that is invoked from the position it occupies within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

o Constructor(sequence, sequenceType, reusableConnection, inputPage)

• sequence: NSEQL browsing program (see [NSEQL]).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

• inputPage: optional parameter; indicates the starting page. If it is not given, the NSEQL program is run directly.

o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.

• inputValues: list of values that can be used as input parameters within the browsing sequence.

• inputPage: optional parameter; the page from which the component browsing sequence is run.

o setRetries(count): sets the number of retries in the event of failures.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.

o close(): closes the connection with the running browser.

o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.

• flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

• reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, that is, when no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

• pages: number of back pages.

o toString(): returns the NSEQL sequence (see [NSEQL]).

o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation

• 1: Internet Explorer browser implementation

• 2: Firefox browser implementation

• 3: Denodo HTTP browser implementation
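The effect of setRetries and setRetryDelay can be pictured with a generic retry loop. The sketch below is not how ITPilot implements retries: runWithRetries is a hypothetical helper, the sleep function is injected so the example stays synchronous, and it assumes that the retry count means additional attempts after the first failure.

```javascript
// Hypothetical helper: runs an operation, retrying after failures.
// retries      -> number of extra attempts after the first one (assumption)
// retryDelayMs -> waiting time between attempts, in milliseconds
// sleep        -> injected function that performs the wait
function runWithRetries(operation, retries, retryDelayMs, sleep) {
  let lastError = null;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return operation(attempt);
    } catch (error) {
      lastError = error;
      // Only wait if another attempt is still going to be made.
      if (attempt < retries) {
        sleep(retryDelayMs);
      }
    }
  }
  // All attempts failed: report the last error seen.
  throw lastError;
}
```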

5.3.28 Store File

• Object: StoreFile

• Description: stores the contents passed as the input parameter in a file.

• Functions:

o Constructor(content, file)

• content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.

• file: path and name of the file where the contents are to be stored.

o exec(): runs the component.

o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.

• generate: indicates whether the file name should be generated automatically.

o setRetries(count): sets the number of retries in the event of failures.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread

• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.

o execute(functionName, <list of arguments>): launches a thread that runs the given function.

• functionName: name of the JavaScript function to be run.

• <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.

• int: maximum number.
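The queueing behavior described for setMaxConcurrentThreads can be modeled with a small scheduler. The sketch below is only a model of the idea (at most N executions in progress, later requests queued); BoundedRunner is a hypothetical class, not the ITPilot Thread object, and each task signals completion through a done callback, which is a convention chosen for this sketch.

```javascript
// Hypothetical model of "at most N executions at a time, the rest are queued".
class BoundedRunner {
  constructor(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = 0;   // executions currently in progress
    this.peak = 0;      // highest number of simultaneous executions seen
    this.queue = [];    // requests waiting for a free slot
  }

  // Registers a task; it starts immediately only if a slot is free.
  execute(task) {
    this.queue.push(task);
    this.drain();
  }

  // Starts queued tasks while there are free slots.
  drain() {
    while (this.running < this.maxConcurrent && this.queue.length > 0) {
      const task = this.queue.shift();
      this.running++;
      this.peak = Math.max(this.peak, this.running);
      // The task calls done() when it finishes, which frees its slot.
      task(() => {
        this.running--;
        this.drain();
      });
    }
  }
}
```

With a limit of 2, a third request stays queued until one of the first two reports completion.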

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically using the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

o This function defines the component output type. The possible values are:

• LIST_TYPE = 1

• PAGE_TYPE = 2

• RECORD_TYPE = 3

• SIMPLE_TYPE = 4

• ARRAY_TYPE = 5

• BINARY_TYPE = 6

• BOOLEAN_TYPE = 7

• DATE_TYPE = 8

• DOUBLE_TYPE = 9

• FLOAT_TYPE = 10

• INT_TYPE = 11

• LONG_TYPE = 12

• STRING_TYPE = 13

• URL_TYPE = 14

• BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

o This function defines the output structure that the component returns. It is necessary only when the output type returned by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
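The relationship between the output type and the need for an output structure can be sketched as follows. Only the function names mycustom_main and mycustom_getOutputType and the type constants come from the list above; requiresOutputStructure is a hypothetical helper, the uppercase logic is invented, and records are represented as plain objects purely for illustration.

```javascript
// Output type constants, with the values listed in section 5.4.1.
const LIST_TYPE = 1;
const PAGE_TYPE = 2;
const RECORD_TYPE = 3;
const SIMPLE_TYPE = 4;

// Declares that this (hypothetical) component returns a record.
function mycustom_getOutputType() {
  return RECORD_TYPE;
}

// Main function; the body is invented: it copies NAME in upper case.
function mycustom_main(mycustom_input) {
  var mycustom_output = null;
  mycustom_output = { NAME: String(mycustom_input.NAME).toUpperCase() };
  return mycustom_output;
}

// Hypothetical helper: an output structure is needed only for records and lists.
function requiresOutputStructure(outputType) {
  return outputType === RECORD_TYPE || outputType === LIST_TYPE;
}
```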

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it should be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file, the name of which matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}

Figure 8: Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> represents the name of the component.

• <input_parameters> is the list of input parameters the custom component receives as input.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is straightforward: the following VQL statement simply has to be written:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

5.3.12 Fetch

• Object: Fetch

• Description: obtains the contents of the URL or page used as the input argument and returns them in binary or text format.

• Functions:

o Constructor(url, sequenceType, reusableConnection, binary, page)

• url: URL where the resource to be downloaded can be found (optional).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

• binary: “true” if the object to be downloaded is binary; “false” if it is in text format.

• page: optionally, the page from which the HTTP request is launched.

o exec(page): runs the component, returning the string- or binary-type value obtained.

• page: optionally, the page from which the HTTP request is launched.

o setEncoding(encoding): sets the MIME type [MIME] of the information to send.

• encoding: MIME type of the information to send.

o syncWithPost(flag): sets the method for recovering the page state. ITPilot sends a POST message to the page URL with the POST parameters that were used to access that page initially. This is the default synchronization method.

• flag: “true” means that this synchronization method must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function; if none exists, ITPilot executes a Back() NSEQL command.

o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, against which further information extraction operations are going to be executed.

• back: NSEQL program of the back sequence.

o setReusingConnection(reusingConnection): indicates whether connections will be reused or not.

• reusingConnection: if set to “true”, the connection coming from previous components is reused; if set to “false”, a new browser is launched, importing the information from the previous session.

o setBackPages(pages): determines the number of pages ITPilot goes back when a Back() NSEQL command is executed, that is, when no back sequence has been defined and no POST navigation has been configured.

o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation

• 1: Internet Explorer browser implementation

• 2: Firefox browser implementation

• 3: Denodo HTTP browser implementation
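The order of precedence between syncWithPost, setBackSequence and setBackPages can be summarized in a small decision function. chooseBackStrategy and the shape of its settings object are hypothetical, and the default of one page back when setBackPages has not been called is an assumption of this sketch.

```javascript
// Hypothetical helper: not an ITPilot function. It only encodes the order of
// precedence described for syncWithPost, setBackSequence and setBackPages.
function chooseBackStrategy(settings) {
  if (settings.syncWithPost) {
    // Re-send the POST parameters with which the page was reached.
    return { method: "POST" };
  }
  if (settings.backSequence) {
    // An explicit back sequence takes priority over the Back() command.
    return { method: "BACK_SEQUENCE", program: settings.backSequence };
  }
  // Assumption: one page back when setBackPages was never called.
  const pages = settings.backPages === undefined ? 1 : settings.backPages;
  return { method: "NSEQL_BACK", pages: pages };
}
```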

5.3.13 Filter

• Object: Filter

• Description: carries out a filtering operation on a list of records, returning those that meet a given condition.

• Functions:

o Constructor(expr, auxiliaryRecords)

• expr: expression that defines the filtering operation, applied to the list of records passed to the exec function.

• auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that satisfies the selection expression given in the constructor.

• inputRecords: list of input records.

• auxiliaryRecords: list of records that take part in the filter condition but are not themselves the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Filter returns the list of filtered elements except for the one that caused the error.
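The effect of ON_ERROR_IGNORE described in the note can be sketched outside ITPilot as follows. FilterStub is a hypothetical stand-in, not the ITPilot Filter object: it models the filter expression as a plain JavaScript predicate, and the onError(errorType, action) call simply mirrors the form used with the Condition object in the figures of this guide. Record fields such as TITLE and PRICE are invented for the illustration.

```javascript
// Hypothetical stand-in for the Filter component (assumption: the filter
// expression is modelled as a JavaScript predicate, not an ITPilot expression).
const RUNTIME_ERROR = "RUNTIME_ERROR";
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

class FilterStub {
  constructor(predicate, auxiliaryRecords) {
    this.predicate = predicate;
    this.auxiliaryRecords = auxiliaryRecords || [];
    this.errorAction = ON_ERROR_RAISE; // assumed default for this sketch
  }
  onError(errorType, action) {
    this.errorAction = action;
  }
  exec(inputRecords, auxiliaryRecords) {
    const aux = auxiliaryRecords || this.auxiliaryRecords;
    const result = [];
    for (const record of inputRecords) {
      try {
        if (this.predicate(record, aux)) result.push(record);
      } catch (e) {
        // With ON_ERROR_IGNORE only the failing record is left out.
        if (this.errorAction !== ON_ERROR_IGNORE) throw e;
      }
    }
    return result;
  }
}

const records = [
  { TITLE: "A", PRICE: 10 },
  { TITLE: "B", PRICE: null }, // evaluating the predicate on this record fails
  { TITLE: "C", PRICE: 30 },
];
const limits = [{ MAX_PRICE: 20 }]; // auxiliary records: used by the condition, never filtered

const cheaperThanLimit = (rec, aux) => {
  if (rec.PRICE === null) throw new Error("PRICE is missing");
  return rec.PRICE <= aux[0].MAX_PRICE;
};

const filter = new FilterStub(cheaperThanLimit, limits);
filter.onError(RUNTIME_ERROR, ON_ERROR_IGNORE);
const cheap = filter.exec(records, limits);
console.log(cheap.map((r) => r.TITLE));
```

In this sketch the record that raises the error is dropped and the remaining records are still evaluated, whereas a stub left at ON_ERROR_RAISE would propagate the error and return nothing.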

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: generates an execution loop over a specific form, using predetermined values for each of its fields in each iteration.

• Functions:

o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

• findForm: NSEQL program that locates the form on which the iteration is based (see [NSEQL] for further information on NSEQL).

• submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

• baseElements: optional list of records that can be used as variables in the NSEQL browsing sequences of this component.

• inputPage: input page from which the selected form is invoked iteratively.

• parallelIterator: if "true", the component executes its iterations in parallel.

o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting at position 0.

• positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

• clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.

• field: name of the multiple selection field.

• position: position of the field among those with the same name, starting at position 0.

• valuesArray: list of values that must be selected in the field.

• positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

• equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or only contained in it (equals = false).

• clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.

• field: name of the HTML selection field.

• position: position of the field when more than one field has the same name.

• positions: positions of the elements on which the component must iterate.

o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.

• field: name of the HTML text field.

• position: position of the field when several fields on the form have the same name.

• values: list of values that must be selected in the field.

• positions: list that indicates the position held by each element of values in the event of replicated values.

• equals: boolean value that indicates whether the field values must exactly match those provided to the function (true) or may simply contain them (false).

o click(field, value, state): selects an element and runs a "click" event on it.

• field: name of the HTML field on which the click is to be made.

• value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.

• state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the state of the element:

• CLICKED_ELEMENT: mark the element.

• NON_CLICKED_ELEMENT: leave the element unmarked.

• CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

o input(field, position, values): indicates the values to be entered in an input field.

• field: name of the HTML input field.

• position: position of the field when several fields on the form have the same name.

• values: list of values to be used in the field.

o textarea(field, position, values): indicates the values to be entered in a text area.

• field: name of the HTML text area field.

• position: position of the field when several fields on the form have the same name.

• values: list of values to be used in the field.

o toList(): returns the list of NSEQL sequences used in each iteration.

o setMaxIterations(count): sets the maximum number of iterations that can be executed.

• count: maximum number of iterations.

o setRetries(count): sets the number of retries in the event of failure.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.

o setParallelIterator(flag): indicates whether the component launches its iterations in parallel.

• flag: if "true", the iterations are executed in parallel.

o next(inputPage): returns the page resulting from running one iteration of the component.

• inputPage: optional parameter that indicates a new starting page on which the next iteration of the component is run.

o hasNext(): determines whether there are more results. It returns "true" if there is at least one more result and "false" otherwise.

o close(): closes the iterator.

o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.

• flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

• reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.

o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor a POST navigation has been configured.

o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation.

• 1: Internet Explorer browser implementation.

• 2: Firefox browser implementation.

• 3: Denodo HTTP browser implementation.
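The hasNext() / next() / close() cycle and the effect of setMaxIterations can be sketched as follows. FormIteratorStub is a hypothetical stand-in that runs outside ITPilot: it does not browse anything, and it assumes that one iteration is generated per combination of the values registered with input(), which is this sketch's reading of the description above rather than a documented guarantee. The field names are invented.

```javascript
// Hypothetical stand-in for Form_Iterator (assumption: iterations are the
// Cartesian product of the values registered for each field).
class FormIteratorStub {
  constructor() {
    this.fields = [];
    this.maxIterations = Infinity;
    this.cursor = 0;
    this.closed = false;
  }
  input(field, position, values) {
    this.fields.push({ field, position, values });
  }
  setMaxIterations(count) {
    this.maxIterations = count;
  }
  combinations() {
    // Build every combination of field values, one object per iteration.
    return this.fields.reduce(
      (acc, f) => acc.flatMap((combo) => f.values.map((v) => ({ ...combo, [f.field]: v }))),
      [{}]
    );
  }
  hasNext() {
    const total = Math.min(this.combinations().length, this.maxIterations);
    return !this.closed && this.cursor < total;
  }
  next() {
    if (!this.hasNext()) throw new Error("no more iterations");
    // The real component returns a page; here only the form values are kept.
    return { formValues: this.combinations()[this.cursor++] };
  }
  close() {
    this.closed = true;
  }
}

const formIterator = new FormIteratorStub();
formIterator.input("city", 0, ["Madrid", "Paris"]);
formIterator.input("year", 0, ["2010", "2011"]);
formIterator.setMaxIterations(3); // stop after three of the four combinations

const pages = [];
try {
  while (formIterator.hasNext()) {
    pages.push(formIterator.next());
  }
} finally {
  formIterator.close(); // always release the iterator
}
console.log(pages.map((p) => p.formValues));
```

Closing the iterator in a finally block is a design choice of the sketch, so that the iterator is released even if processing a page fails.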

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool using a previously retrieved identifier.

• Functions:

o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.

• browserUuid: browser identifier.

o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).

• pageType: type of browser used to access the page:

• SEQUENCE_IEBROWSER = 1

• SEQUENCE_HTTP_BROWSER = 2

• lastURL: last URL the page comes from.

• lastURLMethod: access method (GET, POST) of the URL the page comes from.

• lastURLPostParameters: POST-method parameters of the URL the page comes from.

• cookie: stored cookie information.

• proxyUser: user name to access the proxy, if required.

• proxyPassword: user password to access the proxy, if required.

• proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.

• Functions:

o Constructor(input, output)

• input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

• output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process; if not specified, the record is generated at runtime (by the exec() function).

o get(name): returns the value of a field of the record created from the initialization parameters.

• name: name of the record field.

o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field must be given a value in queries on the wrapper. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.

• fixedValue: optional parameter that indicates the constant value assigned to the field.

o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field must be given a value in queries on the wrapper. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.

• fixedValue: optional parameter that indicates the constant value assigned to the field.

o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field must be given a value in queries on the wrapper. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.

• fixedValue: optional parameter that indicates the constant value assigned to the field.

o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field must be given a value in queries on the wrapper. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.

• fixedValue: optional parameter that indicates the constant value assigned to the field.

o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field must be given a value in queries on the wrapper. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.

• fixedValue: optional parameter that indicates the constant value assigned to the field.

o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field must be given a value in queries on the wrapper. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.

• fixedValue: optional parameter that indicates the constant value assigned to the field.

o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field must be given a value in queries on the wrapper. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.

• fixedValue: optional parameter that indicates the constant value assigned to the field.

o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

• field: name of the field to create.

• obl: indicates whether the field must be given a value in queries on the wrapper. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.

• fixedValue: optional parameter that indicates the constant value assigned to the field.

o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

• field: name of the field to create.

• format: representation format of the date field. The format is optional, but if one is specified it must be respected; otherwise the wrapper may fail to run. The representation format is defined in [DATEFORMAT].

• obl: indicates whether the field must be given a value in queries on the wrapper. The possible values are:

• OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

• OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

• FIXED: the parameter has a constant value, assigned through the fixedValue parameter described below.

• fixedValue: optional parameter that indicates the constant value assigned to the field.

o setName(name): sets the component name.

• name: new component name.

o setI18n(i18n): sets the internationalization (i18n) configuration of the process.

• i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

o exec(): main function for running the component; it returns a record representing the wrapper initialization parameters.
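The difference between OPTIONAL, OBLIGATORY and FIXED fields can be sketched as follows. InitStub is a hypothetical stand-in that runs outside ITPilot: it assumes the query values arrive as a plain JavaScript object passed as the input argument, that an optional field without a value is reported as null, and that a missing obligatory field raises an error. None of these details is confirmed by this guide, and the field names are invented.

```javascript
// Hypothetical stand-in for the Init component (assumptions are noted inline).
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

class InitStub {
  constructor(input, output) {
    this.input = input || {}; // assumption: query values as a plain object
    this.output = output;     // name of the output record
    this.fields = [];
  }
  declare(type, field, obl, fixedValue) {
    this.fields.push({ type, field, obl: obl || OPTIONAL, fixedValue });
  }
  setText(field, obl, fixedValue) {
    this.declare("text", field, obl, fixedValue);
  }
  setInt(field, obl, fixedValue) {
    this.declare("int", field, obl, fixedValue);
  }
  exec() {
    const record = {};
    for (const f of this.fields) {
      if (f.obl === FIXED) {
        record[f.field] = f.fixedValue; // constant, whatever the query says
      } else if (f.field in this.input) {
        record[f.field] = this.input[f.field];
      } else if (f.obl === OBLIGATORY) {
        throw new Error("Missing obligatory parameter: " + f.field);
      } else {
        record[f.field] = null; // assumption: optional fields default to null
      }
    }
    return record;
  }
}

const init = new InitStub({ KEYWORD: "denodo" }, "INPUT");
init.setText("KEYWORD", OBLIGATORY);
init.setInt("MAX_RESULTS"); // obl omitted, so OPTIONAL applies
init.setText("LANGUAGE", FIXED, "en");
const params = init.exec();
console.log(params);
```

A query that omits KEYWORD would fail in this sketch, while one that omits MAX_RESULTS would not.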

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records, one by one.

• Functions:

o Constructor(list)

• list: list of records on which to iterate.

o hasNext(): determines whether there are more results on which to iterate. It returns "true" if there is at least one more result.

o next(): returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: sends a query to any source available via JDBC and returns a list of records with the results obtained.

• Functions:

o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

• uuid: unique identifier of the component.

• uri: connection URL to the database.

• driver: driver class used to connect to the data source.

• userName: user name.

• password: user password.

• structure: structure of the component's output record list. It is defined as a record of values.

• baseRecords: record list to be used.

• maxPoolSize: maximum number of connections that the pool can manage at the same time.

• initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

• checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

• query: SQL query that returns the results required by the component.

o exec(query, baseRecords): executes the JDBCExtractor component.

• query: SQL query that returns the results required by the component.

• baseRecords: record list to be used.

o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

• maxPoolSize: maximum number of connections that the pool can manage at the same time.

• initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

• pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

o disablePool(): disables the connection pool.

o addDriverProperty(propname, propvalue): adds a JDBC driver property.

• propname: property name.

• propvalue: property value.

5.3.19 Loop

• Description: allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function
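The structure of Figure 6 can be tried outside ITPilot with a small stand-in. ConditionStub is hypothetical: it wraps a JavaScript function instead of an ITPilot condition expression, and its onError method only records the requested action. The page counter is invented for the illustration.

```javascript
// Hypothetical stand-in for the Condition object (assumption: the condition
// is a JavaScript function rather than an ITPilot expression).
const RUNTIME_ERROR = "RUNTIME_ERROR";
const ON_ERROR_RAISE = "ON_ERROR_RAISE";

class ConditionStub {
  constructor(predicate) {
    this.predicate = predicate;
    this.handlers = {};
  }
  onError(errorType, action) {
    this.handlers[errorType] = action; // recorded only; not used by this stub
  }
  exec(records) {
    return Boolean(this.predicate(records));
  }
}

let pageNumber = 1;
const visitedPages = [];

const loop = new ConditionStub(() => pageNumber <= 3);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
  // Loop operations: the condition is re-evaluated before every pass.
  visitedPages.push(pageNumber);
  pageNumber++;
}
console.log(visitedPages);
```

Because the condition is checked before each pass, the body does not run at all when the condition is false from the start.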

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: allows iteration over different inter-related pages, using one or several browsing sequences.

• Functions:

o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

• sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.

• iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session information.

• inputPage: indicates the page from which the next browsing sequence is run.

o next(inputRecords, inputPage): returns the next iteration element.

• inputRecords: list of input records that can be used as parameters in the browsing sequences of the next interval.

• inputPage: indicates the page from which the next pages are to be accessed.

o close(): closes the iterator.

o setRetries(count): sets the number of retries in the event of an error when accessing the next page.

• count: number of retries.

o setRetryDelay(count): sets the interval between two retries.

• count: interval in milliseconds.

o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.

• flag: "true" indicates that this synchronization method is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out on it.

• back: NSEQL back program.

o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

• reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported.

o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command is run because neither an explicit back sequence nor a POST navigation has been configured.

o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:

• 0: default browser implementation.

• 1: Internet Explorer browser implementation.

• 2: Firefox browser implementation.

• 3: HTTP browser implementation.

5.3.21 Output

• Object: Output

• Description: places a record in the wrapper output.

• Functions:

o Constructor(structure)

• structure: indicates the component input record to be used as the wrapper result.

o add(record): adds the given component input record to the wrapper result.

• record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.

• Functions:

o Constructor(recordsObj, name)

• recordsObj: list of input elements. Each element of the list can be a record or a list of records.

• name: name of the output record of the Record Constructor component.

o add(fieldName, expression, errorAction): adds a new field to the record under construction.

• fieldName: name of the field.

• expression: expression that defines the field; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].

• errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:

• ON_ERROR_RAISE: stop the wrapper execution, indicating the source of the error.

• ON_ERROR_IGNORE: ignore the error and continue with the wrapper execution.

o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements except for the one that caused the error.
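The way add() and exec() combine fields from several input records can be sketched as follows. RecordConstructorStub is a hypothetical stand-in that runs outside ITPilot. It models each expression as a JavaScript function over the recordsObj list instead of an ITPilot expression string, and it assumes that a field whose expression fails under ON_ERROR_IGNORE is left as null; the guide does not specify that detail. The record and field names are invented.

```javascript
// Hypothetical stand-in for Record_Constructor (assumptions noted inline).
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

class RecordConstructorStub {
  constructor(recordsObj, name) {
    this.recordsObj = recordsObj;
    this.name = name;
    this.fields = [];
  }
  add(fieldName, expression, errorAction) {
    this.fields.push({ fieldName, expression, errorAction: errorAction || ON_ERROR_RAISE });
  }
  exec() {
    const record = {};
    for (const { fieldName, expression, errorAction } of this.fields) {
      try {
        record[fieldName] = expression(this.recordsObj);
      } catch (e) {
        if (errorAction === ON_ERROR_RAISE) throw e; // stop, reporting the error
        record[fieldName] = null; // assumption: ignored fields are left empty
      }
    }
    return record;
  }
}

const search = { KEYWORD: "denodo" };            // first input record
const item = { TITLE: "ITPilot", PRICE: "12.50" }; // second input record

const rc = new RecordConstructorStub([search, item], "RESULT");
rc.add("KEYWORD", (r) => r[0].KEYWORD, ON_ERROR_RAISE); // field copied from the first record
rc.add("TITLE", (r) => r[1].TITLE, ON_ERROR_RAISE);
rc.add("PRICE_NUM", (r) => Number(r[1].PRICE), ON_ERROR_IGNORE); // derived attribute
rc.add("DISCOUNT", (r) => r[2].DISCOUNT, ON_ERROR_IGNORE); // fails: there is no third record
const result = rc.exec();
console.log(result);
```

Choosing the error action per field lets a wrapper fail fast on fields it cannot do without while tolerating gaps in secondary ones.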

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.

• Functions:

o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

• sequences: ordered list of the NSEQL browsing sequences to be used by the component.

• sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session information. In general this value will be "true", although this may not be a good option if the previous iterator runs in parallel.

• inputPage: optional; indicates a starting page.

o exec(): returns a page object that represents the target page of the browsing sequences.

o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.

• Functions:

o Constructor(page)

• page: page loaded in the browser that is going to be released.

o Constructor(browserUuid)

• browserUuid: browser identifier.

o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
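The practical consequence of the do… while form in Figure 7 is that the loop operations run at least once, because the Condition object is evaluated only after each pass. The sketch below shows this next to a plain while loop. ConditionStub is a hypothetical stand-in that wraps a JavaScript function instead of an ITPilot condition expression; as in the figure, the loop continues while exec() returns true.

```javascript
// Hypothetical stand-in for the Condition object (assumption: the condition
// is a JavaScript function rather than an ITPilot expression).
const RUNTIME_ERROR = "RUNTIME_ERROR";
const ON_ERROR_RAISE = "ON_ERROR_RAISE";

class ConditionStub {
  constructor(predicate) {
    this.predicate = predicate;
    this.handlers = {};
  }
  onError(errorType, action) {
    this.handlers[errorType] = action; // recorded only; not used by this stub
  }
  exec(records) {
    return Boolean(this.predicate(records));
  }
}

// Repeat-style loop: the body runs before the condition is evaluated.
let repeatPasses = 0;
const repeat = new ConditionStub(() => false); // never asks for another pass
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
  repeatPasses++;
} while (repeat.exec([]));

// Loop-style (while) for comparison: the condition is evaluated first.
let whilePasses = 0;
const loop = new ConditionStub(() => false);
while (loop.exec([])) {
  whilePasses++;
}

console.log(repeatPasses, whilePasses);
```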

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the wrapper generation graphical interface, it is translated into a JavaScript function that is invoked from the position the component holds in the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

o Constructor(sequence, sequenceType, reusableConnection, inputPage)

• sequence: NSEQL browsing program (see [NSEQL]).

• sequenceType: type of pool to use. The possible values are:

• SEQUENCE_IEBROWSER

• SEQUENCE_HTTP_BROWSER

• SEQUENCE_FTP

• SEQUENCE_LOCAL

• reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.

• inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.

o exec(inputValues, inputPage): runs the Sequence component, returning the last page reached by the browsing sequence.

• inputValues: list of values that can be used as input parameters in the browsing sequence.

• inputPage: optional parameter that indicates the page from which the component's browsing sequence is run.

o setRetries(count): sets the number of retries in the event of failure.

• count: number of retries.

o setRetryDelay(mseconds): sets the waiting time between retries.

• mseconds: waiting time between retries, in milliseconds.

o close(): closes the connection with the running browser.

o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization method.

• flag: "true" indicates that this synchronization method must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): sets the number of pages that ITPilot must go back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and POST navigation has not been configured as the back sequence.

    • pages: number of back pages.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
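
The back-navigation rules described for syncWithPost, setBackSequence and setBackPages can be read as a single decision procedure. The following is a minimal sketch in plain JavaScript that runs outside ITPilot. The function chooseBackStrategy and its option names are hypothetical and are not part of the ITPilot API, and the default of one back page is an assumption, not a documented value.

```javascript
// Hypothetical helper, not part of the ITPilot API: it only restates the
// documented precedence between the three back-navigation settings.
function chooseBackStrategy(options) {
    // 1. syncWithPost("true"): re-issue the POST that produced the page.
    if (options.syncWithPost) {
        return "POST";
    }
    // 2. Otherwise use the explicit back sequence, if one was set.
    if (options.backSequence) {
        return "BACK_SEQUENCE: " + options.backSequence;
    }
    // 3. Otherwise fall back to the NSEQL Back() command, repeated for the
    //    configured number of pages (the default of 1 is an assumption).
    var pages = options.backPages === undefined ? 1 : options.backPages;
    return "Back() x " + pages;
}

var withPost = chooseBackStrategy({ syncWithPost: true, backSequence: "ignored" });
var withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

Here withPost selects the POST strategy even though a back sequence is present, and withBack falls through to two Back() steps because neither of the other mechanisms is configured.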

5.3.28 Store File

• Object: StoreFile

• Description: stores in a file the contents received as the input parameter.

• Functions:

  o Constructor(content, file)

    • content: string-type or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.

    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread

• Description: represents a thread in the ITPilot wrapper. It is typically used to process concurrently each of the records obtained in an extraction operation.

• Functions:

  o wait(): puts the thread on standby until all the executions launched with the execute function have finished.

  o execute(functionName, <list of arguments>): launches a thread that runs the given function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.

    • int: maximum number.
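
The queueing behaviour of setMaxConcurrentThreads can be pictured with a small model. The sketch below is plain JavaScript and does not use the ITPilot Thread object. BoundedRunner is a hypothetical name, and the first-in, first-out order of the queue is an assumption, since the guide only states that later requests are queued.

```javascript
// Hypothetical model, not ITPilot code: tracks which requests run and which wait.
function BoundedRunner(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = [];
    this.queued = [];
}

// A new request starts at once if a slot is free; otherwise it is queued.
BoundedRunner.prototype.execute = function (name) {
    if (this.running.length < this.maxConcurrent) {
        this.running.push(name);
    } else {
        this.queued.push(name);
    }
};

// When an execution finishes, the oldest queued request takes its slot.
BoundedRunner.prototype.finish = function (name) {
    var index = this.running.indexOf(name);
    if (index === -1) {
        return;
    }
    this.running.splice(index, 1);
    if (this.queued.length > 0) {
        this.running.push(this.queued.shift());
    }
};

var runner = new BoundedRunner(2);
runner.execute("record1");
runner.execute("record2");
runner.execute("record3"); // no free slot: queued
runner.finish("record1");  // record3 takes the freed slot
```

With a maximum of two, the third request waits until one of the first two finishes, which is the behaviour the description of setMaxConcurrentThreads implies.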

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components and define the following functions in it:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the output type of the component. The possible values are:

    • LIST_TYPE = 1

    • PAGE_TYPE = 2

    • RECORD_TYPE = 3

    • SIMPLE_TYPE = 4

    • ARRAY_TYPE = 5

    • BINARY_TYPE = 6

    • BOOLEAN_TYPE = 7

    • DATE_TYPE = 8

    • DOUBLE_TYPE = 9

    • FLOAT_TYPE = 10

    • INT_TYPE = 11

    • LONG_TYPE = 12

    • STRING_TYPE = 13

    • URL_TYPE = 14

    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function defines the output structure that the component returns. It is required only when the output type returned by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
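
As an illustration of the naming scheme, the following sketch shows what the file of a hypothetical component called “upper” might contain. The component name and its behaviour are invented for this example. The sketch assumes that the input can be handled as a plain string, which the guide does not confirm, and it leaves the input structure as a placeholder because the guide elides the body of that function.

```javascript
// Type code taken from the list of output types above.
var STRING_TYPE = 13;

// Main function of a hypothetical component named "upper".
// Assumption: the input is handled here as a plain string.
function upper_main(upper_input) {
    var upper_output = null;
    upper_output = String(upper_input).toUpperCase();
    return upper_output;
}

// Input schema. The guide elides the body of this function, so this sketch
// returns null as a placeholder rather than guessing the schema format.
function upper_getInputStructure() {
    return null;
}

// Output type: a string, so no upper_getOutputStructure function is needed.
function upper_getOutputType() {
    return STRING_TYPE;
}
```

Because the output type is neither RECORD_TYPE nor LIST_TYPE, the sketch omits the getOutputStructure function, in line with the rule stated above.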

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by one .js file, whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as described in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following piece of code:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> is the name of the component.

• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is straightforward. The VQL statement is written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is launched and the information from the previous session is imported.

  o setBackPages(pages): sets the number of pages that ITPilot must go back when the NSEQL Back() command has to be run because no back sequence has been defined and POST navigation has not been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation

5.3.13 Filter

• Object: Filter

• Description: filters a list of records and returns those that meet a given condition.

• Functions:

  o Constructor(expr, auxiliaryRecords)

    • expr: regular expression of the filtering operation, applied to the list of records passed to the exec function.

    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

  o exec(inputRecords, auxiliaryRecords): receives a list of records and returns the subset that complies with the selection expression given in the constructor.

    • inputRecords: list of input records.

    • auxiliaryRecords: list of records that take part in the filter condition but are not the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Filter returns the list of filtered elements except for the one that caused the error.
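
The effect of ON_ERROR_IGNORE described in the note can be sketched in plain JavaScript. The function filterIgnoringErrors below is a hypothetical stand-in, not the ITPilot Filter object. It takes an ordinary JavaScript predicate instead of an ITPilot expression, and the PRICE field is invented for the example.

```javascript
// Hypothetical stand-in for the behaviour of Filter under ON_ERROR_IGNORE;
// the real component evaluates an ITPilot expression, not a JS predicate.
function filterIgnoringErrors(inputRecords, condition) {
    var result = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (condition(inputRecords[i])) {
                result.push(inputRecords[i]);
            }
        } catch (error) {
            // The record that caused the error is left out; filtering continues.
        }
    }
    return result;
}

var records = [{ PRICE: 10 }, { PRICE: null }, { PRICE: 30 }, { PRICE: 5 }];
var expensive = filterIgnoringErrors(records, function (record) {
    if (record.PRICE === null) {
        throw new Error("PRICE has no value");
    }
    return record.PRICE >= 10;
});
// expensive holds the records with PRICE 10 and PRICE 30
```

The record without a price raises an error and is dropped, while the remaining records are still evaluated against the condition.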

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: generates an execution loop over a specific form, using predetermined values for each of its fields in each run.

• Functions:

  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

    • findForm: NSEQL program that locates the form used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

    • submitForm: NSEQL program that submits the form (see [NSEQL] for further information on NSEQL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • baseElements: optional list of records that can be used as variables in the NSEQL browsing sequences of this component.

    • inputPage: input page from which the selected form is invoked iteratively.

    • parallelIterator: if “true”, the component executes its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this purpose:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • valuesArray: list of values that must be selected in the field.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one that appears in the selection field (equals = true) or only contained in it (equals = false).

    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this purpose:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.

    • field: name of the HTML selection field.

    • position: position of the field when more than one field has the same name.

    • positions: values of the elements on which the component must iterate.

  o selectTexts(field, position, values, positions, equals): indicates the values to be used in the different iterations on a text field.

    • field: name of the HTML text field.

    • position: position of the field when several fields on the form have the same name.

    • values: list of values that must be selected in the field.

    • positions: list that indicates the position held by each element of values in the event of replicated values.

    • equals: Boolean value that indicates whether the field values must exactly match those provided to the function or may only contain them.

  o click(field, value, state): selects an element and runs a “click” event on it.

    • field: name of the HTML field on which the click is to be made.

    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When it is run on checkboxes, it indicates the value of the selectable element.

    • state: when this function is run on radio buttons, this parameter is not used. When it is run on checkboxes, it indicates the status of the element:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o input(field, position, values): indicates the values added to an input field.

    • field: name of the HTML input field.

    • position: position of the field when several fields on the form have the same name.

    • values: list of values that must be selected in the field.

  o textarea(field, position, values): indicates the values added to a text area.

    • field: name of the HTML input field.

    • position: position of the field when several fields on the form have the same name.

    • values: list of values that must be selected in the field.

  o toList(): returns the list of NSEQL sequences used in each iteration.

  o setMaxIterations(count): sets the maximum number of iterations that can be executed.

    • count: maximum number of iterations.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o setParallelIterator(flag): determines whether the component launches its iterations in parallel.

    • flag: if “true”, the iterations are executed in parallel.

  o next(inputPage): returns the page that results from running one iteration of the component.

    • inputPage: optional parameter that indicates a new starting page on which the next iteration of the component is run.

  o hasNext(): determines whether there are more results. It returns “true” if there is at least one more result and “false” otherwise.

  o close(): closes the iterator.

  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization method.

    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, an NSEQL Back() command is run.

  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): sets the number of pages that ITPilot must go back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and POST navigation has not been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
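
The way the three clicked-state constants turn into form combinations can be sketched as follows. The code is plain JavaScript with local stand-ins for the constants (their numeric values are arbitrary), and expandClickedStates is a hypothetical helper. That ITPilot combines the states of several elements as a full cross product is an assumption: the guide only states that CLICKED_AND_NON_CLICKED_ELEMENT produces two combinations for one element.

```javascript
// Local stand-ins for the ITPilot constants; the numeric values are arbitrary.
var CLICKED_ELEMENT = 0;
var NON_CLICKED_ELEMENT = 1;
var CLICKED_AND_NON_CLICKED_ELEMENT = 2;

// Hypothetical helper: returns every combination of marked (true) and
// unmarked (false) states described by clickedArray.
function expandClickedStates(clickedArray) {
    var combinations = [[]];
    clickedArray.forEach(function (state) {
        var choices;
        if (state === CLICKED_ELEMENT) {
            choices = [true];
        } else if (state === NON_CLICKED_ELEMENT) {
            choices = [false];
        } else {
            choices = [true, false]; // one combination each way
        }
        var next = [];
        combinations.forEach(function (combination) {
            choices.forEach(function (choice) {
                next.push(combination.concat([choice]));
            });
        });
        combinations = next;
    });
    return combinations;
}

var combos = expandClickedStates([
    CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT,
    NON_CLICKED_ELEMENT
]);
// combos: [[true, true, false], [true, false, false]]
```

Only the second element is allowed to vary, so the three states yield two combinations, each of which would correspond to one iteration of the form.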

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool, given a previously retrieved identification code.

• Functions:

  o Constructor(browserUuid): obtains (or optionally creates) the handler of an active browser from its identifier.

    • browserUuid: browser id.

  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the current state of the browser. The function can be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).

    • pageType: type of browser used to access the page:

      • SEQUENCE_IEBROWSER = 1

      • SEQUENCE_HTTP_BROWSER = 2

    • lastURL: last URL from which the page comes.

    • lastURLMethod: access method (GET, POST) of the URL from which the page comes.

    • lastURLPostParameters: POST-method parameters of the URL from which the page comes.

    • cookie: stored “cookies” information.

    • proxyUser: user name to access the proxy, if required.

    • proxyPassword: user password to access the proxy, if required.

    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.

• Functions:

  o Constructor(input, output)

    • input: input record of the component. It is optional and is used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at run time (with the exec() function).

  o get(name): returns the value of a field of the record created from the initialization parameters.

    • name: name of the record field.

  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

    • field: name of the field to create.

    • format: representation format of the date field. The format is optional, but if it is specified it becomes compulsory; otherwise the wrapper may fail to run. The representation format is defined in [DATEFORMAT].

    • obl: indicates whether the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in every wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): sets the component name.

    • name: new component name.

  o setI18n(i18n): sets the internationalization (i18n) configuration of the process.

    • i18n: type of internationalization to use. ITPilot provides several i18n configurations, such as ES_EURO, US_PST and GB. See [GENER] for more information about internationalization in ITPilot.

  o exec(): main function that runs the component. It returns a record that represents the wrapper initialization parameters.
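
The three values of the obl parameter can be illustrated with a small model of how an initialization record might be resolved for one query. This is a sketch in plain JavaScript, not the Init component itself. The function resolveParameters, the field names and the string values of the constants are hypothetical. Two behaviours are assumptions that the guide does not spell out: a missing obligatory parameter raises an error, and a FIXED field ignores any value sent by the caller.

```javascript
// Local stand-ins for the ITPilot constants; the values are arbitrary.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical helper: builds the initialization record for one query.
// fields: [{ name, obl, fixedValue }]; queryValues: values sent by the caller.
function resolveParameters(fields, queryValues) {
    var record = {};
    fields.forEach(function (field) {
        var obl = field.obl === undefined ? OPTIONAL : field.obl;
        if (obl === FIXED) {
            record[field.name] = field.fixedValue; // constant value
        } else if (Object.prototype.hasOwnProperty.call(queryValues, field.name)) {
            record[field.name] = queryValues[field.name];
        } else if (obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + field.name);
        }
        // An OPTIONAL field without a value is simply left out.
    });
    return record;
}

var fields = [
    { name: "KEYWORD", obl: OBLIGATORY },
    { name: "MAX_PRICE" },
    { name: "LANGUAGE", obl: FIXED, fixedValue: "en" }
];
var record = resolveParameters(fields, { KEYWORD: "laptop" });
// record: { KEYWORD: "laptop", LANGUAGE: "en" }
```

MAX_PRICE defaults to OPTIONAL and is absent from the record because the query gave it no value, while LANGUAGE is always present with its constant value.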

5.3.17 Iterator

• Object: Iterator

• Description: iterates over a list of records, one by one.

• Functions:

  o Constructor(list)

    • list: list of records on which to iterate.

  o hasNext(): determines whether there are more results on which to iterate. It returns “true” if there is at least one more result.

  o next(): returns the next element of the iteration. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
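
Without the parallel option, the same hasNext()/next() pattern processes the records one at a time. The sketch below shows that pattern in plain JavaScript. ListIterator is a hypothetical stand-in that walks an ordinary array, so that the loop can run outside ITPilot; it is not the ITPilot Iterator object, and the TITLE field is invented for the example.

```javascript
// Hypothetical stand-in for the ITPilot Iterator, so that the loop below can
// run outside ITPilot. It walks a plain array in order.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}

// True while at least one element has not been returned yet.
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};

// Returns the current element and advances to the following one.
ListIterator.prototype.next = function () {
    return this.list[this.position++];
};

var iterator = new ListIterator([{ TITLE: "first" }, { TITLE: "second" }]);
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);
}
// titles: ["first", "second"]
```

The records are visited in list order, which matches the description of the list as a sorted sequence of records.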

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.

• Functions:

  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

    • uuid: unique identifier of the component.

    • uri: connection URL of the database.

    • driver: driver class used to connect to the data source.

    • userName: user name.

    • password: user password.

    • structure: structure of the output record list of the component. It is defined as a record of values.

    • baseRecords: record list to be used.

    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

    • query: SQL query that returns the results required by the component.

  o exec(query, baseRecords): executes the JDBCExtractor component.

    • query: SQL query that returns the results required by the component.

    • baseRecords: record list to be used.

  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 47

o disablePool() disables the connection pool

o addDriverProperty(propname propvalue) adds a JDBC driver property

bull propname property name

bull propvalue property value


5.3.19 Loop

• Description: this allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6 Using the Loop function


5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: this allows iteration over different inter-related pages, using one or several browsing sequences.
• Functions:
    o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
        • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
        • iterations: this indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter.
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reuse: boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session’s information.
        • inputPage: this indicates the page from which the next browsing sequence is to be made.
    o next(inputRecords, inputPage): returns the next iteration element.
        • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
        • inputPage: this indicates the page from which the next pages are to be accessed.
    o close(): closes the iterator.
    o setRetries(count): configures the number of retries in the event of an error in accessing the next page.
        • count: number of retries.
    o setRetryDelay(count): configures the interval between two retries.
        • count: interval in milliseconds.
    o syncWithPost(flag): this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization function.
        • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether there is a back sequence defined with a setBackSequence method. If there is not, an NSEQL Back() command is run.
    o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out.
        • back: NSEQL back program.
    o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
        • reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data from the previous session is imported.
    o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as the back sequence.
    o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
        • 0: default browser implementation
        • 1: Internet Explorer browser implementation
        • 2: Firefox browser implementation
        • 3: HTTP browser implementation


5.3.21 Output

• Object: Output
• Description: this places a record in the wrapper output.
• Functions:
    o Constructor(structure)
        • structure: parameter that indicates the component input record to be used as the wrapper result.
    o add(record): adds a component input record to the wrapper result.
        • record: record to use.


5.3.22 Record Constructor

• Object: Record_Constructor
• Description: this allows a record to be constructed using other records generated in the flow, as well as generating new attributes derived from already existing ones.
• Functions:
    o Constructor(recordsObj, name)
        • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
        • name: name of the output record of the Record Constructor component.
    o add(fieldName, expression, errorAction): adds a new field to the record under construction.
        • fieldName: name of the field.
        • expression: field definition expression; e.g. “$0PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
        • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
            • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
            • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
    o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.


5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: this creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.
• Functions:
    o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
        • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
        • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reuse: boolean value that indicates whether the browser used to date is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.
        • inputPage: optional; this allows a homepage to be indicated.
    o exec(): returns a page object that represents the target page of the browsing sequences.
    o All of the methods offered by the Sequence component.


5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.
• Functions:
    o Constructor(page)
        • page: page loaded on the browser that is going to be released.
    o Constructor(browserUuid)
        • browserUuid: browser identifier.
    o exec(): executes the component.


5.3.25 Repeat

• Description: this allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
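One practical difference between the Loop component (while) and the Repeat component (do… while) is when the condition is first tested. The sketch below shows only that difference; StubCondition, runLoop and runRepeat are hypothetical names, and the stub does not reproduce how ITPilot evaluates condition expressions.

```javascript
// Hypothetical stand-in for a Condition object: exec() evaluates a predicate.
function StubCondition(predicate) {
    this.predicate = predicate;
}
StubCondition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// Loop-style flow: the condition is tested before each pass.
function runLoop(condition, body) {
    var passes = 0;
    while (condition.exec([])) {
        body();
        passes++;
    }
    return passes;
}

// Repeat-style flow: the body runs once before the condition is first tested.
function runRepeat(condition, body) {
    var passes = 0;
    do {
        body();
        passes++;
    } while (condition.exec([]));
    return passes;
}

var alwaysFalse = new StubCondition(function () { return false; });
var loopPasses = runLoop(alwaysFalse, function () {});     // body never runs
var repeatPasses = runRepeat(alwaysFalse, function () {}); // body runs exactly once
```

With a condition that is false from the start, the while form skips its body entirely, whereas the do… while form still executes it once.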


5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When this component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.


5.3.27 Sequence

• Object: Sequence
• Description: this creates a browsing sequence in NSEQL language (see [NSEQL]).
• Functions:
    o Constructor(sequence, sequenceType, reusableConnection, inputPage)
        • sequence: NSEQL browsing program (see [NSEQL]).
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
        • inputPage: optional parameter; this indicates the starting page. If not given, the NSEQL program is run directly.
    o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
        • inputValues: list of values that can be used as input parameters within the browsing sequence.
        • inputPage: optional parameter; this describes the page from which the component browsing sequence is run.
    o setRetries(count): updates the number of retries in the event of failures.
        • count: number of retries.
    o setRetryDelay(mseconds): sets the waiting time between retries.
        • mseconds: waiting time between retries, in milliseconds.
    o close(): closes the connection with the running browser.
    o syncWithPost(flag): this method indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization function.
        • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether there is a back sequence defined with a setBackSequence method. If there is not, an NSEQL Back() command is run.
    o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out.
        • back: NSEQL back program.
    o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
        • reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data from the previous session is imported.
    o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as the back sequence.
        • pages: number of back pages.
    o toString(): returns the NSEQL (see [NSEQL]) sequence.
    o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
        • 0: default browser implementation
        • 1: Internet Explorer browser implementation
        • 2: Firefox browser implementation
        • 3: Denodo HTTP browser implementation
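The retry settings (setRetries and setRetryDelay) follow a familiar retry-with-delay pattern. The sketch below shows that general pattern in plain JavaScript; it is not ITPilot’s implementation. runWithRetries and its sleep argument are hypothetical, and it is an assumption that the retry count means additional attempts after the first failure.

```javascript
// Hypothetical helper: runs `action`, retrying up to `retries` extra times and
// calling `sleep(delayMs)` between attempts. The last error is rethrown.
function runWithRetries(action, retries, delayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (error) {
            if (attempt >= retries) {
                throw error; // retries exhausted
            }
            attempt++;
            sleep(delayMs);
        }
    }
}

// Usage with a simulated page access that fails twice, then succeeds.
var delays = [];
var page = runWithRetries(
    function (attempt) {
        if (attempt < 2) {
            throw new Error("simulated access failure");
        }
        return "page after " + attempt + " retries";
    },
    3,   // comparable in spirit to setRetries(3)
    500, // comparable in spirit to setRetryDelay(500)
    function (ms) { delays.push(ms); } // records the wait instead of sleeping
);
```

Passing the sleep function as an argument keeps the sketch synchronous and easy to test; a real wait would block or be scheduled by the host environment.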


5.3.28 Store File

• Object: StoreFile
• Description: this stores the contents entered as the input parameter in a file.
• Functions:
    o Constructor(content, file)
        • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content will be stored.
        • file: path and name of the file where the contents are to be stored.
    o exec(): runs the component.
    o setGenerateFilename(generate): determines whether the output file name should be automatically generated when the input file is null or is a directory.
        • generate: indicates whether the file name should be automatically generated.
    o setRetries(count): updates the number of retries in the event of failures.
        • count: number of retries.
    o setRetryDelay(mseconds): sets the waiting time between retries.
        • mseconds: waiting time between retries, in milliseconds.


5.3.29 Thread

• Object: Thread
• Description: this represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
    o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
    o execute(functionName, <list of arguments>): launches a run thread on the described function.
        • functionName: name of the JavaScript function to be run.
        • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
    o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
        • int: maximum number.
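The usual call order is one execute() per record followed by a single wait() before the results are read. The sketch below mimics that order with a single-threaded stand-in: StubThread, processRecord and the record fields are hypothetical, and the stub takes a function reference and runs the queued calls inside wait(), which is a simplification of the concurrent behavior described above.

```javascript
// Hypothetical single-threaded stand-in for the Thread object (illustration only).
// It queues each call and runs the queue inside wait(); nothing runs in parallel.
function StubThread() {
    this.pending = [];
}
StubThread.prototype.execute = function (fn) {
    // Arguments after the function are forwarded to it, in order.
    var args = Array.prototype.slice.call(arguments, 1);
    this.pending.push(function () { fn.apply(null, args); });
};
StubThread.prototype.wait = function () {
    while (this.pending.length > 0) {
        this.pending.shift()();
    }
};

var results = [];
function processRecord(structureInstance, recordInstance) {
    results.push(structureInstance.prefix + recordInstance.ID);
}

var thread = new StubThread();
var records = [{ ID: 1 }, { ID: 2 }, { ID: 3 }];
for (var i = 0; i < records.length; i++) {
    thread.execute(processRecord, { prefix: "rec-" }, records[i]);
}
thread.wait(); // results are only guaranteed complete after wait() returns
```

Reading shared results only after wait() avoids observing a partially filled list when the executions really do run concurrently.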


5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
    o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
    o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
    o This function defines the component output type. The possible values are:
        • LIST_TYPE = 1
        • PAGE_TYPE = 2
        • RECORD_TYPE = 3
        • SIMPLE_TYPE = 4
        • ARRAY_TYPE = 5
        • BINARY_TYPE = 6
        • BOOLEAN_TYPE = 7
        • DATE_TYPE = 8
        • DOUBLE_TYPE = 9
        • FLOAT_TYPE = 10
        • INT_TYPE = 11
        • LONG_TYPE = 12
        • STRING_TYPE = 13
        • URL_TYPE = 14
        • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
    o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
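As an illustration of the naming convention, the sketch below outlines part of a component file for a hypothetical component called “mycustom”. Several things are assumptions: the constants are declared locally with the numeric values listed above so that the sketch runs on its own, plain arrays of objects stand in for ITPilot record lists, and the NAME field is invented. The two structure functions are left out because their return format is not described in this section.

```javascript
// Output type constant, using the numeric value listed in this section.
// Declared locally so the sketch is standalone; ITPilot may already define it.
var LIST_TYPE = 1;

// Main function of the hypothetical "mycustom" component:
// keeps only the input records whose (assumed) NAME field is non-empty.
function mycustom_main(mycustom_input) {
    var mycustom_output = [];
    for (var i = 0; i < mycustom_input.length; i++) {
        if (mycustom_input[i].NAME) {
            mycustom_output.push(mycustom_input[i]);
        }
    }
    return mycustom_output;
}

// Declares that the component returns a list of records.
function mycustom_getOutputType() {
    return LIST_TYPE;
}
```

Because the declared type is LIST_TYPE, a complete component would also need mycustom_getOutputStructure, as stated above.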

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the VQL statement is written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the recently generated JavaScript code.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.


REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl


5.3.13 Filter

• Object: Filter
• Description: this carries out a filtering operation on a list of records, returning those that meet a given condition.
• Functions:
    o Constructor(expr, auxiliaryRecords)
        • expr: regular expression of the filtering operation for a list of records, which are described in the exec function.
        • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.
    o exec(inputRecords, auxiliaryRecords): function that receives a list of records and returns the subgroup that complies with the selection expression indicated in the constructor.
        • inputRecords: list of input records.
        • auxiliaryRecords: record list that participates in the filter condition but does not contain the records to filter.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, FILTER will return the list of filtered elements except for the one that caused the error.
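The effect described in the note can be pictured with a plain JavaScript filter that skips only the record whose evaluation fails. This is a sketch of the behavior, not the Filter component itself: filterRecords, the predicate, the PRICE field and the constant values are hypothetical stand-ins.

```javascript
// Local stand-ins for the error-handling constants; the values are arbitrary.
var ON_ERROR_RAISE = "raise";
var ON_ERROR_IGNORE = "ignore";

// Returns the records that satisfy `predicate`. When evaluating a record throws,
// ON_ERROR_IGNORE drops only that record; any other setting rethrows the error.
function filterRecords(inputRecords, predicate, errorAction) {
    var output = [];
    for (var i = 0; i < inputRecords.length; i++) {
        try {
            if (predicate(inputRecords[i])) {
                output.push(inputRecords[i]);
            }
        } catch (error) {
            if (errorAction !== ON_ERROR_IGNORE) {
                throw error;
            }
        }
    }
    return output;
}

// Hypothetical condition: keep records cheaper than 10; a missing PRICE is an error.
function cheaperThanTen(record) {
    if (typeof record.PRICE !== "number") {
        throw new Error("PRICE is not a number");
    }
    return record.PRICE < 10;
}

var sample = [{ PRICE: 5 }, { PRICE: "n/a" }, { PRICE: 20 }, { PRICE: 7 }];
var cheap = filterRecords(sample, cheaperThanTen, ON_ERROR_IGNORE);
```

Here the record with the invalid PRICE is silently left out, while the remaining records are filtered as usual.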


5.3.14 Form Iterator

• Object: Form_Iterator
• Description: this allows a run loop to be generated for a specific form, where predetermined values for each of the fields included are used in each run.
• Functions:
    o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)
        • findForm: NSEQL program that finds the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).
        • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
        • baseElements: optional list of records that can be employed as variables in the different NSEQL browsing sequences used in this component.
        • inputPage: input page from which the selected form can be iteratively invoked.
        • parallelIterator: if “true”, the component will execute its iterations in parallel.
    o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field in the target form.
        • field: name of the multiple selection field.
        • position: position of the field among those with the same name, starting at position 0.
        • positionsArray: list that indicates the position of each valuesArray element, in the event of replicated values.
        • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
            • CLICKED_ELEMENT: mark the element.
            • NON_CLICKED_ELEMENT: leave the element unmarked.
            • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
    o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field for the chosen form.
        • field: name of the multiple selection field.
        • position: position of the field among those with the same name, starting at position 0.
        • valuesArray: list of values that must be selected in the field.
        • positionsArray: list that indicates the position of each valuesArray element, in the event of replicated values.
        • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
        • clickedArray: list that indicates whether each valuesArray element is marked, not marked, or both. The following JavaScript constants are defined for this:
            • CLICKED_ELEMENT: mark the element.
            • NON_CLICKED_ELEMENT: leave the element unmarked.
            • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
    o selectPositions(field, position, positions): indicates the values selected from a selection field for the chosen form.
        • field: name of the HTML selection field.
        • position: position of the field, in the event of more than one field element with the same name.
        • positions: values of the elements on which the component must iterate.
    o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
        • field: name of the HTML text field.
        • position: position of the field, in the event of several on the form with the same value.
        • values: list of values that must be selected in the field.
        • positions: list that indicates the position of each value element, in the event of replicated values.
        • equals: boolean value that indicates whether the field values must exactly match those provided to the function or may merely be contained in them.
    o click(field, value, state): selects an element and runs a “click” event on it.
        • field: name of the HTML field on which the click is to be made.
        • value: when this function is run on Radio Buttons, this parameter indicates the elements selected, as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.
        • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
            • CLICKED_ELEMENT: mark the element.
            • NON_CLICKED_ELEMENT: leave the element unmarked.
            • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
    o input(field, position, values): indicates the values added to an input field.
        • field: name of the HTML input field.
        • position: position of the field, in the event of several on the form with the same name.
        • values: list of values that must be selected in the field.
    o textarea(field, position, values): indicates the values added to a text area.
        • field: name of the HTML input field.
        • position: position of the field, in the event of several on the form with the same name.
        • values: list of values that must be selected in the field.
    o toList(): returns the list with the NSEQL sequences used in each iteration.
    o setMaxIterations(count): sets the maximum number of iterations that can be executed.
        • count: maximum number of iterations.
    o setRetries(count): updates the number of retries in the event of failures.
        • count: number of retries.
    o setRetryDelay(mseconds): sets the waiting time between retries.
        • mseconds: waiting time between retries, in milliseconds.
    o setParallelIterator(flag): makes the component launch its iterations in parallel.
        • flag: if “true”, the iterations will be executed in parallel.
    o next(inputPage): returns the page resulting from running a component iteration.
        • inputPage: optional parameter that indicates a new starting page on which a new component iteration is run.
    o hasNext(): determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.
    o close(): closes the iterator.
    o syncWithPost(flag): this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization method.
        • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether there is a back sequence defined with a setBackSequence function. If there is not, an NSEQL Back() command is run.
    o setBackSequence(back): optionally indicates an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out.
        • back: NSEQL back program.
    o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
        • reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data from the previous session is imported.
    o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as the back sequence.
    o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
        • 0: default browser implementation
        • 1: Internet Explorer browser implementation
        • 2: Firefox browser implementation
        • 3: Denodo HTTP browser implementation

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing by using a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: stores the structure of the input data, which is the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record that groups the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional, but once it has been specified it becomes compulsory: values that do not follow it may prevent the wrapper from running. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. Returns a record representing the wrapper initialization parameters.
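
The three values of the obl parameter can be pictured with a small toy model. This is only a sketch in plain JavaScript: the constants, the field descriptors and the checkQuery() function are hypothetical stand-ins written for this illustration and are not part of the Init object. How ITPilot itself treats a caller-supplied value for a FIXED field is not stated here, so the sketch simply assumes the constant wins.

```javascript
// Toy model of the OPTIONAL / OBLIGATORY / FIXED modes described above.
// These constants and checkQuery() are hypothetical, not ITPilot API.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// fields: array of { name, obl, fixedValue }; query: values supplied by the caller.
// Returns the effective parameter record, or throws if an obligatory field is missing.
function checkQuery(fields, query) {
    var record = {};
    fields.forEach(function (f) {
        var supplied = Object.prototype.hasOwnProperty.call(query, f.name);
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue;   // assumption: the constant always wins
        } else if (supplied) {
            record[f.name] = query[f.name];
        } else if (f.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null;           // optional and not supplied
        }
    });
    return record;
}

var fields = [
    { name: "TITLE", obl: OBLIGATORY },
    { name: "AUTHOR", obl: OPTIONAL },
    { name: "LANG", obl: FIXED, fixedValue: "en" }
];
var record = checkQuery(fields, { TITLE: "Dune", LANG: "fr" });
```

In this model a query without TITLE is rejected, AUTHOR may be left out, and LANG always carries its constant value.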

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. Returns “true” if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphic interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
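
The hasNext()/next() contract described in this section can be tried outside ITPilot with a stand-in object. The ListIterator below is a hypothetical minimal implementation written for this sketch; it only mimics the two functions documented for the Iterator component and processes the records sequentially rather than through a Thread object.

```javascript
// Hypothetical stand-in for the Iterator component: walks an array in order.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}

// True while at least one element is left.
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};

// Returns the current element and advances to the following one.
ListIterator.prototype.next = function () {
    return this.list[this.position++];
};

var records = [{ ID: 1 }, { ID: 2 }, { ID: 3 }];
var iterator = new ListIterator(records);
var seen = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    seen.push(recordInstance.ID);   // sequential processing, one record at a time
}
```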

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: iterates over a series of inter-related pages by means of one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information.
    • inputPage: page from which the next browsing sequence is run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds a component input record to the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression. For example, “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stops the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignores the error and continues with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements except for the one that caused the error.
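
To make the positional notation concrete, the following sketch resolves references of the form $<index>.<FIELD> against a list of input records, assuming the example expression above is written as $0.PARAM1. The resolveReference() function and the plain-object records are hypothetical and written for this illustration; this is not ITPilot’s expression parser and it does not cover the functions of Appendix A of [GENER].

```javascript
// Toy resolver for references of the form $<index>.<FIELD>.
// Hypothetical helper for illustration only, not ITPilot's expression engine.
function resolveReference(reference, recordsObj) {
    var match = /^\$(\d+)\.(\w+)$/.exec(reference);
    if (!match) {
        throw new Error("Unsupported reference: " + reference);
    }
    var record = recordsObj[Number(match[1])];   // $0 is the first input record
    if (record === undefined) {
        throw new Error("No input record at index " + match[1]);
    }
    return record[match[2]];
}

// Input records modeled as plain objects (an assumption of this sketch).
var recordsObj = [{ PARAM1: "alpha", PARAM2: 7 }, { PRICE: 12.5 }];
var first = resolveReference("$0.PARAM1", recordsObj);
var price = resolveReference("$1.PRICE", recordsObj);
```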

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences that access other pages to be created from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; indicates a starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
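
The practical difference between the while construct used by Loop (section 5.3.19) and the do… while construct used by Repeat is when the condition is evaluated. The following plain JavaScript sketch shows it; the condition is an ordinary function that stands in for the exec() call of a Condition object, and countIterations() is a hypothetical helper written for this illustration.

```javascript
// Plain JavaScript illustration; `condition` stands in for Condition.exec().
// Returns how many times the loop body was executed.
function countIterations(useRepeat, condition) {
    var iterations = 0;
    if (useRepeat) {
        // Repeat style: body first, condition checked afterwards.
        do { iterations++; } while (condition(iterations));
    } else {
        // Loop style: condition checked before each pass.
        while (condition(iterations)) { iterations++; }
    }
    return iterations;
}

var neverTrue = function () { return false; };
var belowThree = function (n) { return n < 3; };

var loopCount = countIterations(false, neverTrue);    // body never runs
var repeatCount = countIterations(true, neverTrue);   // body runs once
var loopThree = countIterations(false, belowThree);
var repeatThree = countIterations(true, belowThree);
```

With a condition that is false from the start, the while form skips the body and the do… while form still executes it once; with a condition that holds for a while, both forms run the same number of passes.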

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the generation graphic interface, it becomes a JavaScript function that is invoked from the position the component holds within the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not specified, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the component browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally indicates an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
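
The interplay between a retry count and a retry delay, as configured with setRetries and setRetryDelay, can be sketched generically. The runWithRetries() helper below is hypothetical and not part of ITPilot; it assumes that the count means additional attempts after the first failure, and it receives the wait as an injected sleep function so that the sketch stays synchronous.

```javascript
// Hypothetical helper, not ITPilot API: runs action(attempt) and, on failure,
// retries up to `retries` more times, calling sleep(delayMs) between attempts.
function runWithRetries(action, retries, delayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (error) {
            if (attempt >= retries) {
                throw error;          // retries exhausted: propagate the last error
            }
            attempt++;
            sleep(delayMs);           // wait before the next attempt
        }
    }
}

// Simulated navigation that fails twice before succeeding.
var waits = [];
var result = runWithRetries(function (attempt) {
    if (attempt < 2) {
        throw new Error("page not loaded");
    }
    return "page";
}, 3, 500, function (ms) { waits.push(ms); });
```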

5.3.28 Store File

• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a run thread on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) {
      var mycustom_output = null;
      …
      return mycustom_output;
  }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }

  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
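
The four functions can be put together in a single file as in the following sketch of a component named mycustom. The function names and the RECORD_TYPE value come from the list above; everything else is an assumption made for illustration: the input is treated as a plain object with a TEXT field, and the structures are returned as simple { field: typeName } objects, which is not necessarily the representation ITPilot expects.

```javascript
// Output type code taken from the list in this section. It is declared here
// only so that the sketch runs outside ITPilot.
var RECORD_TYPE = 3;

// Main function: receives the component input and returns its output.
// Assumption: the input behaves like a plain object with a TEXT field.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    var text = String(mycustom_input.TEXT);
    mycustom_output = { TEXT: text, LENGTH: text.length };
    return mycustom_output;
}

// Input schema. The { field: typeName } shape is an assumption of this sketch.
function mycustom_getInputStructure() {
    return { TEXT: "string" };
}

// Declares that the component returns a record.
function mycustom_getOutputType() {
    return RECORD_TYPE;
}

// Required here because the output type is RECORD_TYPE.
function mycustom_getOutputStructure() {
    return { TEXT: "string", LENGTH: "int" };
}

var sample = mycustom_main({ TEXT: "hello" });
```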

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the following VQL statement has to be written:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘\’ character.
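
The escaping rule in the note can be automated with a small helper, sketched below in plain JavaScript. Both functions are hypothetical and written for this illustration. The quote character that delimits jscode is passed in by the caller, since the statement above does not show it, and only quotes are escaped because that is all the note requires.

```javascript
// Hypothetical helper: backslash-escapes every occurrence of the delimiter
// quote so that the script can be embedded in a CREATE WRAPPER ITP statement.
function escapeForVql(jscode, quoteChar) {
    return jscode.split(quoteChar).join("\\" + quoteChar);
}

// Builds the statement text. The delimiter quote is an assumption supplied
// by the caller; the optional MAINTENANCE clause is left out of this sketch.
function buildCreateWrapper(name, jscode, quoteChar) {
    return "CREATE WRAPPER ITP " + name + " " +
        quoteChar + escapeForVql(jscode, quoteChar) + quoteChar;
}

var statement = buildCreateWrapper("books", 'var s = "x";', '"');
```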

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

5.3.14 Form Iterator

• Object: Form_Iterator

• Description: generates an execution loop over a specific form, in which each run uses predetermined values for each of the included fields.

• Functions:

  o Constructor(findForm, submitForm, sequenceType, reusableConnection, baseElements, inputPage, parallelIterator)

    • findForm: NSEQL program that locates the form to be used as the basis of the iteration (see [NSEQL] for further information on NSEQL).

    • submitForm: NSEQL program that invokes the form (see [NSEQL] for further information on NSEQL).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • baseElements: optional list of records that can be used as variables in the different NSEQL browsing sequences used in this component.

    • inputPage: input page from which the selected form can be iteratively invoked.

    • parallelIterator: if “true”, the component executes its iterations in parallel.

  o selectMultiplePositions(field, position, positionsArray, clickedArray): indicates which positions are selected in a multiple selection field of the target form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this purpose:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected in a multiple selection field of the chosen form.

    • field: name of the multiple selection field.

    • position: position of the field among those with the same name, starting at position 0.

    • valuesArray: list of values that must be selected in the field.

    • positionsArray: list that indicates the position held by each valuesArray element in the event of replicated values.

    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to the one appearing in the selection field (equals = true) or contained in it (equals = false).

    • clickedArray: list that indicates whether each valuesArray element is to be marked, not marked, or both. The following JavaScript constants are defined for this purpose:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o selectPositions(field, position, positions): indicates the values selected in a selection field of the chosen form.

    • field: name of the HTML selection field.

    • position: position occupied by the field when more than one field element has the same name.

    • positions: values of the elements over which the component must iterate.

  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.

    • field: name of the HTML text field.

    • position: position of the field in the event of several on the form with the same value.

    • values: list of values that must be selected in the field.

    • positions: list that indicates the position held by each value element in the event of replicated values.

    • equals: boolean value that indicates whether the field values must exactly match those provided to the function or may merely be contained.

  o click(field, value, state): selects an element and runs a “click” event on it.

    • field: name of the HTML field on which the click is to be made.

    • value: when this function is run on Radio Buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.

    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:

      • CLICKED_ELEMENT: mark the element.

      • NON_CLICKED_ELEMENT: leave the element unmarked.

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.

  o input(field, position, values): indicates the values added to an input field.

    • field: name of the HTML input field.

    • position: position of the field in the event of several on the form with the same name.

    • values: list of values that must be selected in the field.

  o textarea(field, position, values): indicates the values added to a text area.

    • field: name of the HTML input field.

    • position: position of the field in the event of several on the form with the same name.

    • values: list of values that must be selected in the field.

  o toList(): returns the list of NSEQL sequences used in each iteration.

  o setMaxIterations(count): sets the maximum number of iterations that can be executed.

    • count: maximum number of iterations.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o setParallelIterator(flag): makes the component launch its iterations in parallel.

    • flag: if “true”, the iterations are executed in parallel.

  o next(inputPage): returns the page resulting from running one iteration of the component.

    • inputPage: optional parameter that indicates a new starting page on which the next component iteration is run.

  o hasNext(): determines whether there are more results. The function returns “true” if there is at least one more result, or “false” if there is not.

  o close(): closes the iterator.

  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization method.

    • flag: “true” indicates that this synchronization method is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, that is, when no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation.

    • 1: Internet Explorer browser implementation.

    • 2: Firefox browser implementation.

    • 3: Denodo HTTP browser implementation.
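The way syncWithPost, setBackSequence and setBackPages interact can be read as an order of precedence. The following plain JavaScript sketch models that order only; the function chooseBackStrategy, the shape of its argument and result, and the default of one page for Back() are assumptions introduced for illustration and are not part of the ITPilot API.

```javascript
// Hypothetical helper (not an ITPilot function): models the documented order in
// which a component decides how to return to its source page.
function chooseBackStrategy(config) {
    // syncWithPost is documented as the default synchronization method.
    var usePost = config.syncWithPost === undefined ? true : config.syncWithPost;
    if (usePost) {
        return { strategy: "POST" };
    }
    // With syncWithPost(false), an explicitly defined back sequence is used.
    if (config.backSequence) {
        return { strategy: "BACK_SEQUENCE", sequence: config.backSequence };
    }
    // Otherwise the NSEQL Back() command is run; setBackPages gives the page
    // count. Falling back to 1 page is an assumption of this sketch.
    var pages = config.backPages === undefined ? 1 : config.backPages;
    return { strategy: "NSEQL_BACK", pages: pages };
}
```

For instance, a configuration with syncWithPost set to “false” and no back sequence resolves to the Back() command, with the page count taken from setBackPages when it has been set.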

5.3.15 Get Page

• Object: Get_Page

• Description: obtains an active browser from the browser pool by means of a previously retrieved identifier.

• Functions:

  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.

    • browserUuid: browser id.

  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters, for later browsing by means of a Sequence object (see section 5.3.27).

    • pageType: type of browser used to access the page:

      • SEQUENCE_IEBROWSER = 1

      • SEQUENCE_HTTP_BROWSER = 2

    • lastURL: last URL the page is coming from.

    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.

    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.

    • cookie: information stored in “cookies”.

    • proxyUser: user name to access the proxy, if required.

    • proxyPassword: user password to access the proxy, if required.

    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init

• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.

• Functions:

  o Constructor(input, output)

    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.

    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at run time (with the exec() function).

  o get(name): returns the value of a field of the record created as the set of initialization parameters.

    • name: name of the record field.

  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

    • field: name of the field to create.

    • obl: parameter that indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

    • field: name of the field to create.

    • format: representation format of the date field. The format is optional, but once specified it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].

    • obl: parameter that indicates whether the field being created is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): updates the component name.

    • name: new component name.

  o setI18n(i18n): updates the internationalization (i18n) configuration of the process.

    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

  o exec(): main function for running the component; it returns a record representing the wrapper initialization parameters.
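The three values of the obl parameter can be illustrated outside ITPilot with a small model. The sketch below is plain JavaScript under stated assumptions: resolveInitRecord and the field-definition objects are hypothetical, the three constants are local stand-ins for the predefined ITPilot names, and the choice to ignore a query value for a FIXED field is an assumption of the sketch rather than documented behavior.

```javascript
// Local stand-ins; in an ITPilot wrapper these names are predefined.
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

// Hypothetical helper: resolves the initialization record for one query.
// fields: [{ name, obl, fixedValue }], query: { fieldName: value }
function resolveInitRecord(fields, query) {
    var record = {};
    fields.forEach(function (f) {
        var obl = f.obl || OPTIONAL;   // OPTIONAL is the documented default
        if (obl === FIXED) {
            // Constant value; ignoring any query value is an assumption here.
            record[f.name] = f.fixedValue;
        } else if (Object.prototype.hasOwnProperty.call(query, f.name)) {
            record[f.name] = query[f.name];
        } else if (obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + f.name);
        }
        // An OPTIONAL field without a value is simply left out of the record.
    });
    return record;
}
```

In this model, a query that omits an OBLIGATORY field fails immediately, an omitted OPTIONAL field is absent from the record, and a FIXED field always carries its fixedValue.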

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records one by one.

• Functions:

  o Constructor(list)

    • list: list of records over which to iterate.

  o hasNext(): determines whether there are more results over which to iterate. “true” is returned if there is at least one more result.

  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
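For comparison with Figure 5, the hasNext()/next() protocol can be exercised sequentially without the Thread object. The sketch below is plain JavaScript: ListIterator is a stand-in that mimics the documented interface over an array, and processAll is a hypothetical helper; neither is ITPilot's own implementation.

```javascript
// Stand-in for illustration only: mimics the documented hasNext()/next()
// interface over a plain array.
function ListIterator(list) {
    this.list = list;
    this.index = 0;
}
ListIterator.prototype.hasNext = function () {
    return this.index < this.list.length;
};
ListIterator.prototype.next = function () {
    return this.list[this.index++];
};

// Sequential counterpart of the threaded loop in Figure 5: each record is
// handled in list order, one at a time.
function processAll(iterator, handler) {
    var results = [];
    while (iterator.hasNext()) {
        results.push(handler(iterator.next()));
    }
    return results;
}
```

Calling processAll with a two-record list and a handler that reads one field yields the field values in the order of the list.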

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.

• Functions:

  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

    • uuid: unique identifier of the component.

    • uri: connection URL to the database.

    • driver: driver class used to connect to the data source.

    • userName: user name.

    • password: user password.

    • structure: structure of the component’s output record list. It is defined as a record of values.

    • baseRecords: record list to be used.

    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.

    • query: SQL query that returns the results required by the component.

  o exec(query, baseRecords): executes the JDBCExtractor component.

    • query: SQL query that returns the results required by the component.

    • baseRecords: record list to be used.

  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.

  o disablePool(): disables the connection pool.

  o addDriverProperty(propname, propvalue): adds a JDBC driver property.

    • propname: property name.

    • propvalue: property value.
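Because the JDBCExtractor constructor takes eleven positional parameters, it can help to assemble them from named values first. The following plain JavaScript sketch does only that: jdbcExtractorArgs and JDBC_EXTRACTOR_PARAMS are hypothetical names, and treating every parameter as required is a choice made by the sketch, not something this guide states.

```javascript
// Documented order of the JDBCExtractor constructor parameters.
var JDBC_EXTRACTOR_PARAMS = [
    "uuid", "uri", "driver", "userName", "password", "structure",
    "baseRecords", "maxPoolSize", "initialPoolSize", "checkQuery", "query"
];

// Hypothetical helper: turns a named configuration object into the positional
// argument list, failing early when a parameter has been left out.
function jdbcExtractorArgs(config) {
    return JDBC_EXTRACTOR_PARAMS.map(function (name) {
        if (!Object.prototype.hasOwnProperty.call(config, name)) {
            throw new Error("Missing JDBCExtractor parameter: " + name);
        }
        return config[name];
    });
}
```

The resulting array lists the values in the order the constructor expects, so a misplaced pool size or query is caught by name rather than by position.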

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6 Using the Loop function
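The shape of Figure 6 can be tried outside ITPilot with a stand-in for the Condition object. In the plain JavaScript sketch below, FakeCondition wraps an ordinary predicate behind the exec() call, whereas the real Condition takes a condition expression; both FakeCondition and countPagesVisited are hypothetical names used only for illustration.

```javascript
// Stand-in for illustration only: exposes the exec() call used in Figure 6,
// but evaluates a plain JavaScript predicate instead of a condition expression.
function FakeCondition(predicate) {
    this.predicate = predicate;
}
FakeCondition.prototype.exec = function (records) {
    return this.predicate(records);
};

// The condition is checked before every pass, so the body may run zero times.
function countPagesVisited(maxPages) {
    var visited = 0;
    var loop = new FakeCondition(function () { return visited < maxPages; });
    while (loop.exec([])) {
        visited++;   // stands for <loop operations>
    }
    return visited;
}
```

With a limit of zero the body never runs, which is the practical difference from the Repeat component described in section 5.3.25.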

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: iterates through a series of inter-related pages using one or several browsing sequences.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.

    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.

    • inputPage: indicates the page from which the next browsing sequence is to be run.

  o next(inputRecords, inputPage): returns the next iteration element.

    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.

    • inputPage: indicates the page from which the next pages are to be accessed.

  o close(): closes the iterator.

  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.

    • count: number of retries.

  o setRetryDelay(count): configures the interval between two retries.

    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, that is, when no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation.

    • 1: Internet Explorer browser implementation.

    • 2: Firefox browser implementation.

    • 3: HTTP browser implementation.
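The combined effect of setRetries and setRetryDelay can be pictured as a retry loop around the page access. The plain JavaScript sketch below is an illustration under assumptions: runWithRetries is a hypothetical helper, the count is taken to mean additional attempts after the first failure, and the wait is delegated to an injected sleep function so that the sketch stays synchronous. How ITPilot implements retries internally is not described in this guide.

```javascript
// Hypothetical helper (not an ITPilot function): runs an action, retrying up to
// `retries` times after a failure and waiting `delayMs` between attempts.
function runWithRetries(action, retries, delayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (error) {
            if (attempt >= retries) {
                throw error;      // retries exhausted: propagate the failure
            }
            sleep(delayMs);       // waiting time between retries, in milliseconds
            attempt++;
        }
    }
}
```

Under these assumptions, two retries with a 500 ms delay allow up to three attempts and at most two waits before the error reaches the caller.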

5.3.21 Output

• Object: Output

• Description: places a record in the wrapper output.

• Functions:

  o Constructor(structure)

    • structure: parameter that indicates the component input record to be used as the wrapper result.

  o add(record): adds, at a later point, the component input record that is to be used as the wrapper result.

    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.

• Functions:

  o Constructor(recordsObj, name)

    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.

    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): adds a new field to the record under construction.

    • fieldName: name of the field.

    • expression: expression that defines the field; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:

      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.
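The effect of a field reference and of the two errorAction values can be modeled in a few lines. The plain JavaScript sketch below assumes that a reference has the form "$N.FIELD" (record index, then field name); it does not cover the functions of Appendix A of [GENER]. The names resolveReference and buildRecord are hypothetical, the two constants are local stand-ins, and skipping the failing field under ON_ERROR_IGNORE is this sketch's reading of the note above.

```javascript
// Local stand-ins; in an ITPilot wrapper these names are predefined.
var ON_ERROR_RAISE = "ON_ERROR_RAISE", ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical helper: resolves a reference of the assumed form "$N.FIELD"
// against the list of input records.
function resolveReference(expression, records) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = records[Number(match[1])];
    if (!record || !Object.prototype.hasOwnProperty.call(record, match[2])) {
        throw new Error("Cannot evaluate: " + expression);
    }
    return record[match[2]];
}

// Builds an output record from { fieldName, expression, errorAction } entries.
function buildRecord(records, definitions) {
    var output = {};
    definitions.forEach(function (d) {
        try {
            output[d.fieldName] = resolveReference(d.expression, records);
        } catch (error) {
            if (d.errorAction !== ON_ERROR_IGNORE) {
                throw error;   // ON_ERROR_RAISE: stop and report the source
            }
            // ON_ERROR_IGNORE: leave this field out and continue
        }
    });
    return output;
}
```

In this model a reference to a missing field either aborts the construction or is silently dropped, depending on the errorAction given for that field.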

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: creates a browsing sequence from the results of a record. It allows sequences to be created that access other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.

    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases this may not be a good option if the previous iterator runs in parallel to it.

    • inputPage: optional; indicates a starting page.

  o exec(): returns a Page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.

• Functions:

  o Constructor(page)

    • page: page loaded in the browser that is going to be released.

  o Constructor(browserUuid)

    • browserUuid: browser identifier.

  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
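What distinguishes the structure of Figure 7 from a while loop is that the operations run once before the condition is evaluated for the first time. The plain JavaScript sketch below shows only that property; runRepeat is a hypothetical helper, and an ordinary function stands in for the exec() call of the Condition object.

```javascript
// Hypothetical helper mirroring the do...while shape of Figure 7: the body runs
// once before the condition is checked, then again while the check returns true.
function runRepeat(body, condition) {
    var passes = 0;
    do {
        body();
        passes++;
    } while (condition());
    return passes;
}
```

Even when the condition is false from the start, the body is executed exactly once.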

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it occupies within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)

    • sequence: NSEQL browsing program (see [NSEQL]).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.

  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.

    • inputValues: list of values that can be used as input parameters within the browsing sequence.

    • inputPage: optional parameter that indicates the page from which the component browsing sequence is run.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o close(): closes the connection with the running browser.

  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL (see [NSEQL]) sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation

5.3.28 Store File

• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread of execution on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
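
The queuing rule described for setMaxConcurrentThreads can be pictured with a small model. The sketch below is not the ITPilot Thread object and runs nothing in parallel: ThreadModel and finishOne are hypothetical names, and serving queued requests in arrival order is an assumption, since the guide does not state the order.

```javascript
// Hypothetical model of the queuing rule described for setMaxConcurrentThreads.
// Nothing runs in parallel here; the model only tracks which requests would be
// running and which would be waiting in the queue.
function ThreadModel() {
    this.max = Infinity;
    this.running = [];
    this.queued = [];
}
ThreadModel.prototype.setMaxConcurrentThreads = function (max) {
    this.max = max;
};
ThreadModel.prototype.execute = function (functionName) {
    if (this.running.length < this.max) {
        this.running.push(functionName);
    } else {
        this.queued.push(functionName);
    }
};
// Simulates one ongoing execution finishing, which frees a slot for the
// oldest queued request (arrival order is an assumption of this sketch).
ThreadModel.prototype.finishOne = function () {
    this.running.shift();
    if (this.queued.length > 0) {
        this.running.push(this.queued.shift());
    }
};

var model = new ThreadModel();
model.setMaxConcurrentThreads(2);
model.execute("taskA");
model.execute("taskB");
model.execute("taskC");                    // limit reached, so taskC waits
var queuedBefore = model.queued.slice();   // ["taskC"]
model.finishOne();                         // taskA finishes, taskC starts
var runningAfter = model.running.slice();  // ["taskB", "taskC"]
```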

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }

  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
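
As a rough illustration of how the functions above fit together, the following sketch shows a component file for a hypothetical component named "shout". The component name, its upper-casing logic and the assumption that the input is a plain string are all invented for the example. The type constants are declared locally only so that the sketch runs on its own; in ITPilot they are presumably already available.

```javascript
// Output type codes as listed in the guide. They are declared here
// (assumption) only so that the sketch can run outside ITPilot.
var LIST_TYPE = 1;
var RECORD_TYPE = 3;
var STRING_TYPE = 13;

// Main function of a hypothetical component named "shout".
// Assumes, for illustration, that the input is a plain string.
function shout_main(shout_input) {
    var shout_output = null;
    shout_output = String(shout_input).toUpperCase() + "!";
    return shout_output;
}

// Input schema: the body is elided in the guide, so it is left empty here.
function shout_getInputStructure() {
    return null;
}

// Declares that the component returns a string value.
function shout_getOutputType() {
    return STRING_TYPE;
}

// shout_getOutputStructure is only required for record or list outputs,
// so this component does not define it.
var outputType = shout_getOutputType();
var needsOutputStructure =
    outputType === RECORD_TYPE || outputType === LIST_TYPE;

var result = shout_main("hello");
```

Because the declared output type is STRING_TYPE, needsOutputStructure is false and no shout_getOutputStructure function is provided.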

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
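
The try… finally shape of Figure 8 suggests that the scope should be closed even when the component fails. The sketch below checks that behaviour with plain JavaScript. SCOPE and CUSTOM_COMPONENT are replaced by hypothetical stand-ins that only record calls, and "exampleType" is a placeholder value, so this shows the control flow rather than the real ITPilot objects.

```javascript
// Hypothetical stand-ins for ITPilot's SCOPE and CUSTOM_COMPONENT objects,
// written only so this sketch can run outside ITPilot.
var log = [];
var SCOPE = {
    create: function () { log.push("create"); },
    close: function () { log.push("close"); }
};
function CUSTOM_COMPONENT(type) {
    this.type = type;
    this.name = null;
}
CUSTOM_COMPONENT.prototype.setComponentName = function (name) {
    this.name = name;
};
CUSTOM_COMPONENT.prototype.exec = function (input) {
    // The stand-in fails on null input and otherwise echoes the input back.
    if (input === null) {
        throw new Error("component " + this.name + " failed");
    }
    return input;
};

// Same shape as Figure 8, wrapped in a function so it can be called twice.
function runComponent(input) {
    try {
        SCOPE.create();
        var mycustom = new CUSTOM_COMPONENT("exampleType"); // placeholder type
        mycustom.setComponentName("mycustom");
        return mycustom.exec(input);
    } finally {
        SCOPE.close(); // runs on normal return and on error
    }
}

var ok = runComponent("value");
var failed = false;
try {
    runComponent(null);
} catch (e) {
    failed = true;
}
// log is now ["create", "close", "create", "close"]
```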

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is simple: the following VQL statement has to be written:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the recently generated JavaScript code.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘\’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectMultipleTexts(field, position, valuesArray, positionsArray, equalsArray, clickedArray): indicates the values selected from a multiple selection field for the chosen form.
    • field: name of the multiple selection field.
    • position: position of the field among those with the same name, starting at position 0.
    • valuesArray: list of values that must be selected in the field.
    • positionsArray: list that indicates the position of each valuesArray element in the event of replicated values.
    • equalsArray: list that indicates whether the value of each valuesArray element must be identical to that appearing in the selection field (equals = true) or contained therein (equals = false).
    • clickedArray: list that indicates whether each valuesArray element can be marked, not marked, or both. The following JavaScript constants are defined for this:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o selectPositions(field, position, positions): indicates the values selected from a selection field for the chosen form.
    • field: name of the HTML selection field.
    • position: position occupied in the event of more than one field element with the same name.
    • positions: values of the elements on which the component must iterate.
  o selectTexts(field, position, values, positions, equal): indicates the values to be used in the different iterations on a text field.
    • field: name of the HTML text field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
    • positions: list that indicates the position of each values element in the event of replicated values.
    • equals: boolean value that indicates whether the field values must exactly match those provided to the function or may merely be contained in them.

  o click(field, value, state): selects an element and runs a “click” event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on Radio Buttons, this parameter indicates the elements selected, as a list (e.g. [0, 1]). When run on Checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on Radio Buttons, this parameter is not used. When run on Checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generates two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML input field.
    • position: position of the field in the event of several on the form with the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list with the NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch the iterations in parallel.
    • flag: if “true”, the iterations are executed in parallel.
  o next(inputPage): returns the page resulting from running a component iteration.
    • inputPage: optional parameter that specifies a new starting page on which the next component iteration is run.

  o hasNext(): determines whether there are more results. Returns “true” if there is at least one more result, or “false” if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization method.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
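
The clickedArray parameter of selectMultipleTexts states that CLICKED_AND_NON_CLICKED_ELEMENT yields two combinations for an element. One way to picture the effect when several elements use it is the sketch below. It assumes the combinations multiply across elements (a Cartesian product), which the guide does not state explicitly. expandClickStates is a hypothetical helper, and the numeric values given to the constants are placeholders.

```javascript
// Constant names come from the guide; the numeric values are placeholders.
var CLICKED_ELEMENT = 0;
var NON_CLICKED_ELEMENT = 1;
var CLICKED_AND_NON_CLICKED_ELEMENT = 2;

// Returns every marked/unmarked assignment implied by clickedArray.
// Each result is an array of booleans, where true means "marked".
function expandClickStates(clickedArray) {
    var combos = [[]];
    clickedArray.forEach(function (state) {
        var options;
        if (state === CLICKED_ELEMENT) {
            options = [true];
        } else if (state === NON_CLICKED_ELEMENT) {
            options = [false];
        } else {
            options = [true, false];
        }
        // Extend every partial combination with each allowed option.
        var next = [];
        combos.forEach(function (combo) {
            options.forEach(function (option) {
                next.push(combo.concat([option]));
            });
        });
        combos = next;
    });
    return combos;
}

var combos = expandClickStates([
    CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT,
    CLICKED_AND_NON_CLICKED_ELEMENT
]);
console.log(combos.length); // prints 4
```

Under this assumption, each element flagged with CLICKED_AND_NON_CLICKED_ELEMENT doubles the number of form submissions, which is worth keeping in mind together with setMaxIterations.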

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL where the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored “cookies” information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In the case of standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a record field created as part of the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.

    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional, but once specified it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is compulsory in queries. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need a value to be assigned in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): sets the component name.
    • name: new component name.
  o setI18n(i18n): sets the internationalization (i18n) configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. Returns a record representing the wrapper initialization parameters.
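
The three obl modes can be read as rules for building the initialization record from the parameters of a query. The sketch below models that reading in plain JavaScript. InitModel and its resolve function are hypothetical and are not the ITPilot Init object; the string values given to the constants, the null default for missing optional fields and the error raised for a missing obligatory field are all assumptions made for illustration.

```javascript
// Hypothetical model of the three "obl" modes; not the ITPilot Init object.
// The constant names come from the guide, their values are placeholders.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

function InitModel() {
    this.fields = [];
}
// Mirrors the setText(field, obl, fixedValue) signature; obl defaults to OPTIONAL.
InitModel.prototype.setText = function (field, obl, fixedValue) {
    this.fields.push({ name: field, obl: obl || OPTIONAL, fixedValue: fixedValue });
};
// Builds a record from the query parameters according to each field's mode.
InitModel.prototype.resolve = function (query) {
    var record = {};
    this.fields.forEach(function (f) {
        if (f.obl === FIXED) {
            record[f.name] = f.fixedValue;          // constant, ignores the query
        } else if (Object.prototype.hasOwnProperty.call(query, f.name)) {
            record[f.name] = query[f.name];         // value supplied by the caller
        } else if (f.obl === OBLIGATORY) {
            throw new Error("missing obligatory parameter: " + f.name);
        } else {
            record[f.name] = null;                  // optional and not supplied
        }
    });
    return record;
};

var init = new InitModel();
init.setText("TITLE", OBLIGATORY);
init.setText("AUTHOR");
init.setText("LANG", FIXED, "en");

var record = init.resolve({ TITLE: "ITPilot" });
// record is { TITLE: "ITPilot", AUTHOR: null, LANG: "en" }

var rejected = false;
try {
    init.resolve({});   // TITLE is obligatory but missing
} catch (e) {
    rejected = true;
}
```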

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. Returns “true” if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
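
For comparison, the sequential (non-parallel) use of hasNext() and next() can be sketched as follows. ListIterator is a hypothetical stand-in backed by a plain array, written so that the example runs outside ITPilot, and the TITLE field is an invented example field.

```javascript
// Hypothetical stand-in for ITPilot's Iterator, backed by a plain array.
function ListIterator(list) {
    this.list = list;
    this.index = 0;
}
ListIterator.prototype.hasNext = function () {
    return this.index < this.list.length;
};
ListIterator.prototype.next = function () {
    return this.list[this.index++];
};

// Records are modelled as plain objects with an invented TITLE field.
var records = [{ TITLE: "a" }, { TITLE: "b" }, { TITLE: "c" }];
var iterator = new ListIterator(records);
var titles = [];

// Records are visited one by one, in list order.
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);
}
console.log(titles.join(",")); // prints "a,b,c"
```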

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class to use to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: allows iterating through different inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 50

o syncWithPost(flag) this function indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function

bull flag ldquotruerdquo indicates that this synchronization function is to be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() method is run

o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out

bull back NSEQL back program

o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not

bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session

o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 HTTP browser implementation
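The interplay between setRetries(count) and setRetryDelay(count) can be pictured with a small standalone sketch. It is not ITPilot code: runWithRetries and its sleep argument are hypothetical helpers, and the sketch assumes that the retry count means additional attempts after the first failure, which the text above does not state.

```javascript
// Hypothetical helper, not part of the ITPilot API.
// Runs action(); on failure retries up to `retries` more times,
// calling sleep(delayMs) between consecutive attempts.
function runWithRetries(action, retries, delayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (e) {
            if (attempt >= retries) {
                throw e; // retries exhausted: report the last error
            }
            sleep(delayMs);
            attempt++;
        }
    }
}

var waits = [];
var page = runWithRetries(
    function (attempt) {
        // Simulated page access that fails on the first two attempts.
        if (attempt < 2) { throw new Error("page not available"); }
        return "page loaded on attempt " + attempt;
    },
    3,   // analogous to setRetries(3)
    500, // analogous to setRetryDelay(500), in milliseconds
    function (ms) { waits.push(ms); } // records the delay instead of blocking
);
```

In this run the access succeeds on the third attempt, after two recorded waits of 500 milliseconds each.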

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: the component input record to be used as the wrapper result.
  o add(record): adds the given record to the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: constructs a record from other records generated in the flow, and can also generate new attributes derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: expression that defines the field. For example, "$0.PARAM1" indicates that the field will contain the field PARAM1 of the first input record in the recordsObj list passed to the constructor. To define the expression, ITPilot provides the set of functions described in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
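The way a field expression such as "$0.PARAM1" points into the list of input records can be sketched in plain JavaScript. This is only an illustration under stated assumptions: resolveFieldReference and buildRecord are hypothetical helpers, records are modelled as plain objects, only the "$<index>.<FIELD>" form is handled, and skipping the failing field under ON_ERROR_IGNORE is an assumption about behaviour that this guide does not spell out.

```javascript
// Placeholder constants so the sketch runs outside ITPilot.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical helper: resolves "$<index>.<FIELD>" against a list of records.
function resolveFieldReference(expression, records) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (match === null) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = records[parseInt(match[1], 10)];
    if (record === undefined || !(match[2] in record)) {
        throw new Error("Cannot evaluate expression: " + expression);
    }
    return record[match[2]];
}

// Hypothetical helper: mimics a series of add(fieldName, expression, errorAction)
// calls followed by exec().
function buildRecord(records, fields) {
    var output = {};
    fields.forEach(function (field) {
        try {
            output[field.name] = resolveFieldReference(field.expression, records);
        } catch (e) {
            if (field.errorAction === ON_ERROR_RAISE) {
                throw e; // stop and report the source of the error
            }
            // ON_ERROR_IGNORE (assumed behaviour): skip this field and continue
        }
    });
    return output;
}

var inputRecords = [{ PARAM1: "books" }, { PRICE: 12 }];
var result = buildRecord(inputRecords, [
    { name: "CATEGORY", expression: "$0.PARAM1", errorAction: ON_ERROR_RAISE },
    { name: "COST", expression: "$1.PRICE", errorAction: ON_ERROR_RAISE },
    { name: "MISSING", expression: "$1.UNKNOWN", errorAction: ON_ERROR_IGNORE }
]);
```

Here CATEGORY is taken from the first input record and COST from the second, while the unresolvable MISSING field is dropped because its error action is ON_ERROR_IGNORE.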

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created to access other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that keeps the session information. In general this value will be "true", although it may not be a good option if the previous iterator runs in parallel with this component.
    • inputPage: optional; specifies a starting page.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: creates a loop in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript as a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides the set of functions described in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
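The practical difference between the Loop pattern of Figure 6 and the Repeat pattern of Figure 7 is when the condition is first tested. The following standalone sketch makes that visible. The Condition function defined here is a hypothetical stand-in that wraps a plain JavaScript predicate; the real ITPilot Condition object evaluates an ITPilot expression instead, and RUNTIME_ERROR and ON_ERROR_RAISE are redefined only so that the sketch runs outside ITPilot.

```javascript
// Hypothetical stand-in for ITPilot's Condition object (illustration only).
function Condition(predicate) {
    this.predicate = predicate;
}
Condition.prototype.onError = function (errorType, action) {
    // No-op in this sketch; the real component configures error handling here.
};
Condition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// Placeholder constants so the sketch runs outside ITPilot.
var RUNTIME_ERROR = "RUNTIME_ERROR";
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

// Loop pattern: the condition is tested before each pass (while).
function runLoop(condition) {
    var passes = 0;
    while (condition.exec([])) {
        passes++;
    }
    return passes;
}

// Repeat pattern: the body runs once before the condition is first tested (do ... while).
function runRepeat(condition) {
    var passes = 0;
    do {
        passes++;
    } while (condition.exec([]));
    return passes;
}

var alwaysFalse = new Condition(function () { return false; });
alwaysFalse.onError(RUNTIME_ERROR, ON_ERROR_RAISE);

var loopPasses = runLoop(alwaysFalse);     // the body never runs
var repeatPasses = runRepeat(alwaysFalse); // the body runs exactly once
```

With a condition that is false from the start, the Loop body is skipped entirely, whereas the Repeat body still executes once. As written in Figure 7, the Repeat body is executed again for as long as repeat.exec([]) returns true.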

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it holds in the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, the NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
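The order in which the three "go back" mechanisms are considered (POST synchronization, an explicit back sequence, and the NSEQL Back() command) can be summarized as a small decision function. This is a reading aid and not ITPilot code: chooseBackStrategy, its configuration object and the returned labels are hypothetical names introduced for the sketch.

```javascript
// Hypothetical helper, not part of the ITPilot API.
// Returns which mechanism would be used to return to the source page,
// following the order described for syncWithPost(flag) and setBackSequence(back).
function chooseBackStrategy(config) {
    if (config.syncWithPost) {
        return "POST";          // re-issue the POST request the page arrived with
    }
    if (config.backSequence) {
        return "BACK_SEQUENCE"; // run the explicit NSEQL back program
    }
    return "NSEQL_BACK";        // fall back to Back(); setBackPages sets how many pages
}

var byDefault = chooseBackStrategy({ syncWithPost: true, backSequence: null });
var withSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var fallback = chooseBackStrategy({ syncWithPost: false, backSequence: null });
```

The three calls cover the default case, the case where POST synchronization is disabled but a back sequence exists, and the final fallback in which only Back() remains.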

5.3.28 Store File

• Object: StoreFile
• Description: stores the contents given as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string-type or binary-type value with the contents to be stored. A page value is also supported as input; in that case the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each record obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to wait until all executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components, and define the following functions in it:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is necessary only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
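As a minimal sketch of how such a file might look, consider a component named "mycustom" that returns a simple string. Several points are assumptions made for the illustration: the input is modelled as a plain string, the STRING_TYPE constant is defined locally (inside ITPilot it is presumably predefined), and mycustom_getInputStructure is left out because this guide does not show its body. Since the output type is STRING_TYPE, no mycustom_getOutputStructure function is needed.

```javascript
// Sketch of a custom component file, assumed to be saved as mycustom.js.

// Output type code as listed above; defined here only so the sketch runs standalone.
var STRING_TYPE = 13;

// Main function: receives the component input and returns the component output.
// Treating the input as a plain string is an assumption of this sketch.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    mycustom_output = String(mycustom_input).toUpperCase();
    return mycustom_output;
}

// Declares that the component returns a string value.
function mycustom_getOutputType() {
    return STRING_TYPE;
}

var sampleOutput = mycustom_main("denodo");
```

The function name prefix and the file name both carry the component name, which is how the component is located when it is used from a wrapper.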

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is simple: write the VQL statement as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the '\' character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

  o click(field, value, state): selects an element and runs a "click" event on it.
    • field: name of the HTML field on which the click is to be made.
    • value: when this function is run on radio buttons, this parameter indicates the selected elements as a list (e.g. [0, 1]). When run on checkboxes, it indicates the value of the selectable element.
    • state: when this function is run on radio buttons, this parameter is not used. When run on checkboxes, it indicates the status of the element:
      • CLICKED_ELEMENT: mark the element.
      • NON_CLICKED_ELEMENT: leave the element unmarked.
      • CLICKED_AND_NON_CLICKED_ELEMENT: generate two combinations, one with the element marked and another with the element unmarked.
  o input(field, position, values): indicates the values added to an input field.
    • field: name of the HTML input field.
    • position: position of the field, in case several fields on the form have the same name.
    • values: list of values that must be selected in the field.
  o textarea(field, position, values): indicates the values added to a text area.
    • field: name of the HTML input field.
    • position: position of the field, in case several fields on the form have the same name.
    • values: list of values that must be selected in the field.
  o toList(): returns the list of NSEQL sequences used in each iteration.
  o setMaxIterations(count): sets the maximum number of iterations that can be executed.
    • count: maximum number of iterations.
  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o setParallelIterator(flag): makes the component launch the iterations in parallel.
    • flag: if "true", the iterations are executed in parallel.
  o next(inputPage): returns the page that results from running a component iteration.
    • inputPage: optional parameter that specifies a new starting page on which the new component iteration is run.
  o hasNext(): determines whether there are more results. The function returns "true" if there is at least one more result, or "false" if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence function. If not, the NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
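The hasNext()/next() pattern and the idea of value combinations can be sketched with a small standalone iterator. This is not the ITPilot component: CombinationIterator is a hypothetical stand-in, and the sketch assumes that the iterations enumerate every combination of the values declared for the form fields, which the description of CLICKED_AND_NON_CLICKED_ELEMENT suggests but this excerpt does not state outright.

```javascript
// Hypothetical stand-in that enumerates value combinations (illustration only).
function CombinationIterator(fields) {
    this.fields = fields; // [{ name: ..., values: [...] }, ...]
    this.total = fields.reduce(function (n, f) { return n * f.values.length; }, 1);
    this.index = 0;
}
CombinationIterator.prototype.hasNext = function () {
    return this.index < this.total;
};
CombinationIterator.prototype.next = function () {
    var remainder = this.index++;
    var combination = {};
    // Decode the running index as a mixed-radix number, last field varying fastest.
    for (var i = this.fields.length - 1; i >= 0; i--) {
        var values = this.fields[i].values;
        combination[this.fields[i].name] = values[remainder % values.length];
        remainder = Math.floor(remainder / values.length);
    }
    return combination;
};

var iterator = new CombinationIterator([
    { name: "query", values: ["java", "sql"] },  // in the spirit of input(field, position, values)
    { name: "inStock", values: [true, false] }   // in the spirit of CLICKED_AND_NON_CLICKED_ELEMENT
]);
var combinations = [];
while (iterator.hasNext()) {
    combinations.push(iterator.next());
}
```

Two fields with two values each produce four combinations, consumed with the same while (hasNext()) / next() loop that a wrapper would use to walk through the iterations.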

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the current state of the browser. The function can be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL that the page comes from.
    • lastURLMethod: access method (GET, POST) of the URL that the page comes from.
    • lastURLPostParameters: POST-method parameters of the URL that the page comes from.
    • cookie: stored cookie information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. It is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a field of the record created from the initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned through the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 43

bull obl parameter that indicates the compulsory nature of the query on the field to create The possible values are

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional but, once specified, it is compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
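The three obligation values accepted by the set functions (OPTIONAL, OBLIGATORY and FIXED) can be illustrated with a small stand-alone model. The sketch below does not use the ITPilot Init object: describeField and resolveQuery are hypothetical helpers, and representing the obligation as a plain string, as well as letting a FIXED value take precedence over the query, are assumptions made only for this illustration.

```javascript
// Hypothetical model of the obligation values; this is NOT the ITPilot Init object.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// Describes one initialization field (name, obligation, optional constant value).
function describeField(field, obl = OPTIONAL, fixedValue = null) {
  return { field, obl, fixedValue };
}

// Builds the record of input values for one query, applying the three rules.
function resolveQuery(fields, queryValues) {
  const record = {};
  for (const f of fields) {
    if (f.obl === FIXED) {
      // FIXED: use the constant (assumed to take precedence in this model).
      record[f.field] = f.fixedValue;
    } else if (f.field in queryValues) {
      record[f.field] = queryValues[f.field];
    } else if (f.obl === OBLIGATORY) {
      // OBLIGATORY: every query must supply a value.
      throw new Error("Missing obligatory parameter: " + f.field);
    } else {
      // OPTIONAL: the value may be left unassigned.
      record[f.field] = null;
    }
  }
  return record;
}

const fields = [
  describeField("TITLE", OBLIGATORY),
  describeField("MAX_PRICE"),
  describeField("COUNTRY", FIXED, "ES"),
];
const record = resolveQuery(fields, { TITLE: "laptop" });
console.log(record);
```

In this model a query that omits TITLE is rejected, MAX_PRICE may be left out, and COUNTRY always carries its constant value.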


5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. It returns “true” if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option available in the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
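The hasNext()/next() contract described in this section can be tried outside ITPilot with a stand-in object. In the sketch below, ListIterator is a hypothetical class written only for this example (it is not the ITPilot Iterator), and the records are invented sample data.

```javascript
// Stand-in for the ITPilot Iterator, only so the loop can run outside ITPilot.
class ListIterator {
  constructor(list) {
    this.list = list;
    this.position = 0;
  }
  // Returns true while at least one more record remains.
  hasNext() {
    return this.position < this.list.length;
  }
  // Returns the next record, in list order.
  next() {
    return this.list[this.position++];
  }
}

const records = [{ ID: 1 }, { ID: 2 }, { ID: 3 }];
const iterator = new ListIterator(records);
const visited = [];
while (iterator.hasNext()) {
  const recordInstance = iterator.next();
  visited.push(recordInstance.ID); // per-record processing goes here
}
console.log(visited);
```

The loop has the same shape as the one in Figure 5; only the per-record processing differs.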


5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.


5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6 Using the Loop function
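The control flow of Figure 6 can be exercised outside ITPilot with a stand-in for the Condition object. ConditionSketch below is a hypothetical class whose exec() simply evaluates a JavaScript predicate; the real Condition object takes a condition expression, as described in section 5.3.5, so this is a sketch of the loop shape only.

```javascript
// Stand-in for the ITPilot Condition object: exec() evaluates a predicate.
class ConditionSketch {
  constructor(predicate) {
    this.predicate = predicate;
  }
  exec(inputs) {
    return this.predicate(inputs);
  }
}

let page = 1;
const pagesRead = [];
const loop = new ConditionSketch(() => page <= 3);

// WHILE... DO: the condition is checked before each pass.
while (loop.exec([])) {
  pagesRead.push(page); // loop operations go here
  page++;
}
console.log(pagesRead);
```

Because the condition is evaluated before the body, the body does not run at all when the condition is false from the start.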


5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: allows iterating over different interrelated pages using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.


  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence has been defined nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation


5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.


5.3.22 Record Constructor

• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, the Record Constructor returns the list of filtered elements, except for the one that caused the error.


5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator runs in parallel to it.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.


5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.


5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
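The practical difference between the while structure of the Loop component and the do… while structure of the Repeat component is that the latter always runs its body at least once. The sketch below shows this in plain JavaScript; ConditionSketch is a hypothetical stand-in whose exec() evaluates a predicate, not the ITPilot Condition object, and countPasses is an illustrative helper.

```javascript
// Stand-in for the ITPilot Condition object: exec() evaluates a predicate.
class ConditionSketch {
  constructor(predicate) {
    this.predicate = predicate;
  }
  exec(inputs) {
    return this.predicate(inputs);
  }
}

// Runs the same body under both loop shapes and counts the passes.
function countPasses(start) {
  let value = start;
  let whilePasses = 0;
  const loop = new ConditionSketch(() => value < 3);
  while (loop.exec([])) {
    whilePasses++;
    value++;
  }

  value = start;
  let repeatPasses = 0;
  const repeat = new ConditionSketch(() => value < 3);
  do {
    repeatPasses++;
    value++;
  } while (repeat.exec([]));

  return { whilePasses, repeatPasses };
}

console.log(countPasses(0));
console.log(countPasses(5));
```

Starting from 0 both shapes make three passes; starting from 5 the while loop makes none and the do… while loop makes one.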


5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component holds within the process flow.


5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not specified, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.


  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence has been defined nor a POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL (see [NSEQL]) sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
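The order in which syncWithPost, setBackSequence and setBackPages take effect can be summarized as a small decision function. The sketch below is a plain JavaScript model of that order, not ITPilot code: chooseBackStrategy, its configuration object and the strategy labels are hypothetical, and the default of one back page is an assumption.

```javascript
// Hypothetical helper mirroring the documented order; not an ITPilot function.
function chooseBackStrategy(config) {
  const { syncWithPost = true, backSequence = null, backPages = 1 } = config;
  if (syncWithPost) {
    // Default: re-issue the POST with which the page was reached.
    return { strategy: "POST" };
  }
  if (backSequence !== null) {
    // An explicit NSEQL back program was given with setBackSequence.
    return { strategy: "BACK_SEQUENCE", sequence: backSequence };
  }
  // Otherwise fall back to NSEQL Back(), going back 'backPages' pages.
  return { strategy: "NSEQL_BACK", pages: backPages };
}

console.log(chooseBackStrategy({}));
console.log(chooseBackStrategy({ syncWithPost: false, backSequence: "myBackProgram" }));
console.log(chooseBackStrategy({ syncWithPost: false, backPages: 2 }));
```

The second call uses a placeholder string where a real NSEQL back program would go.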


5.3.28 Store File

• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
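The combined effect of a retry count and a retry delay, as configured by setRetries and setRetryDelay, can be sketched as a generic retry loop. The code below is plain JavaScript, not the ITPilot implementation: runWithRetries is a hypothetical helper, the sleep function is injected so that the example stays deterministic, and counting the retries in addition to the first attempt is an assumption.

```javascript
// Hypothetical retry helper; 'operation' and 'sleep' are supplied by the caller.
function runWithRetries(operation, retries, retryDelayMs, sleep) {
  let lastError = null;
  // One initial attempt plus up to 'retries' further attempts (assumption).
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        sleep(retryDelayMs); // wait before the next attempt
      }
    }
  }
  throw lastError;
}

// Usage: the operation fails twice, then succeeds on the third attempt.
const waits = [];
const result = runWithRetries(
  (attempt) => {
    if (attempt < 2) throw new Error("temporary failure");
    return "stored";
  },
  3,
  500,
  (ms) => waits.push(ms) // records the delay instead of really sleeping
);
console.log(result, waits);
```

If every attempt fails, the helper rethrows the last error once the retries are exhausted.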


5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): puts the thread on standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches the execution thread on the specified function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
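The queueing rule of setMaxConcurrentThreads (requests beyond the maximum wait until an ongoing execution finishes) can be illustrated with a toy model. ThreadQueueSketch below is hypothetical and is not the ITPilot Thread object: it runs nothing in parallel, each task receives an extra done callback as its last argument so that the example can decide when an execution ends, and isIdle stands in for the state that wait() would block for.

```javascript
// Toy model of the queueing rule; NOT the ITPilot Thread object.
class ThreadQueueSketch {
  constructor() {
    this.maxConcurrent = Infinity;
    this.running = 0;
    this.pending = [];
  }
  setMaxConcurrentThreads(max) {
    this.maxConcurrent = max;
  }
  // Queues the task and starts it as soon as a slot is free.
  execute(task, ...args) {
    this.pending.push({ task, args });
    this.startPending();
  }
  startPending() {
    while (this.running < this.maxConcurrent && this.pending.length > 0) {
      const { task, args } = this.pending.shift();
      this.running++;
      task(...args, () => {
        this.running--; // an execution finished, so a slot is free again
        this.startPending();
      });
    }
  }
  // True once every execution has finished.
  isIdle() {
    return this.running === 0 && this.pending.length === 0;
  }
}

const started = [];
const finish = [];
const thread = new ThreadQueueSketch();
thread.setMaxConcurrentThreads(2);
for (const id of ["r1", "r2", "r3"]) {
  thread.execute((record, done) => {
    started.push(record);
    finish.push(done);
  }, id);
}
const startedBeforeFinish = started.slice(); // only two executions have started
finish[0](); // the first execution ends, so the queued one starts
console.log(startedBeforeFinish, started);
```

With a maximum of two, the third request starts only after one of the first two has finished.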


5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
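The four functions listed above can be laid out as a skeleton file. The sketch below is an illustration under assumptions, not a component verified against ITPilot: the type codes are redeclared locally so that the file runs on its own, the upper-casing logic is invented sample behaviour, the bodies that the guide elides are left as null placeholders, and needsOutputStructure is a hypothetical helper that only restates the rule about RECORD_TYPE and LIST_TYPE.

```javascript
// Output type codes as listed in this section; declared locally for the sketch.
const LIST_TYPE = 1;
const RECORD_TYPE = 3;
const STRING_TYPE = 13;

// Main function: receives the component input and returns its output.
function mycustom_main(mycustom_input) {
  var mycustom_output = null;
  // Hypothetical logic: upper-case the input value.
  mycustom_output = String(mycustom_input).toUpperCase();
  return mycustom_output;
}

// Input schema: the guide elides the body, so this returns a placeholder.
function mycustom_getInputStructure() {
  return null; // placeholder, not a real ITPilot structure
}

// Output type: a simple string in this sketch.
function mycustom_getOutputType() {
  return STRING_TYPE;
}

// Only needed for RECORD_TYPE or LIST_TYPE outputs; placeholder here.
function mycustom_getOutputStructure() {
  return null;
}

// Hypothetical helper restating when an output structure is required.
function needsOutputStructure(type) {
  return type === RECORD_TYPE || type === LIST_TYPE;
}

console.log(mycustom_main("abc"), needsOutputStructure(mycustom_getOutputType()));
```

Since this sketch returns STRING_TYPE, mycustom_getOutputStructure is never required; a component returning a record or a list would have to define it.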

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is simple. The VQL statement is written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just developed.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.


REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl


  o hasNext(): determines whether there are more results. It returns “true” if there is at least one more result, or “false” if there is not.
  o close(): closes the iterator.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which the page was reached. This is the default synchronization method.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence function. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither an explicit back sequence has been defined nor a POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation


5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identifier.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser’s current state. The function can be executed with no parameters for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page.
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored “cookies” information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.


5.3.16 Init

• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. It is optional and used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a record field created as a group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field to create is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether queries must supply a value for the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether queries must supply a value for the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether queries must supply a value for the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether queries must supply a value for the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether queries must supply a value for the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.

    • field: name of the field to create.

    • obl: indicates whether queries must supply a value for the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.

    • field: name of the field to create.

    • format: representation format of the date field. The format is optional, but once it is specified it becomes compulsory; otherwise, the wrapper may not run. The representation format is defined in [DATEFORMAT].

    • obl: indicates whether queries must supply a value for the field to create. The possible values are:

      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.

      • OBLIGATORY: the parameter is obligatory in every query made on the wrapper.

      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.

    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): updates the component name.

    • name: new component name.

  o setI18n(i18n): updates the i18n configuration of the process.

    • i18n: internationalization configuration to use. ITPilot provides several i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

  o exec(): main function for running the component; returns a record that represents the wrapper initialization parameters.
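As an illustration of how the setters might be combined, the sketch below declares three query parameters and then calls exec(). The Init class and the OPTIONAL, OBLIGATORY and FIXED constants defined here are simplified stand-ins written only so that the example runs outside ITPilot; they are assumptions, not the ITPilot implementation, and the field names (KEYWORD, MAX_RESULTS, SITE) are hypothetical.

```javascript
// Stand-in constants (assumed values; ITPilot defines its own).
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// Minimal stand-in for the Init component: it only records field definitions.
class Init {
  constructor(input, output) {
    this.input = input;
    this.output = output;
    this.fields = {};
  }
  // Hypothetical shared helper used by the typed setters below.
  _set(type, field, obl = OPTIONAL, fixedValue) {
    // Assumption of this sketch: a FIXED field must come with its value.
    if (obl === FIXED && fixedValue === undefined) {
      throw new Error("FIXED field '" + field + "' needs a fixedValue");
    }
    this.fields[field] = { type, obl, value: obl === FIXED ? fixedValue : null };
  }
  setText(field, obl, fixedValue) { this._set("text", field, obl, fixedValue); }
  setInt(field, obl, fixedValue) { this._set("int", field, obl, fixedValue); }
  get(name) { return this.fields[name].value; }
  exec() { return this.fields; }
}

const init = new Init();
init.setText("KEYWORD", OBLIGATORY);        // must be supplied in every query
init.setInt("MAX_RESULTS");                 // obl omitted, so OPTIONAL applies
init.setText("SITE", FIXED, "example.com"); // constant value
const initRecord = init.exec();
console.log(initRecord);
```

In a real wrapper only the last five statements would be written by hand; the objects and constants are supplied by the ITPilot runtime.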

5.3.17 Iterator

• Object: Iterator

• Description: component that iterates over a list of records, one by one.

• Functions:

  o Constructor(list)

    • list: list of records over which to iterate.

  o hasNext(): determines whether there are more results over which to iterate. It returns “true” if there is at least one more result.

  o next(): returns the next element of the iteration. The list is an ordered sequence of records.

The “Parallel Execution” option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5: Using threads in the Iterator component
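The iteration pattern of Figure 5 can be tried outside ITPilot with the sketch below. StubIterator and StubThread are hypothetical stand-ins for the Iterator and Thread objects; in particular, the stand-in thread runs each call immediately and in order, and its wait() returns the collected results, both of which are assumptions made to keep the example deterministic. The processRecord function and the TITLE field are also invented for the illustration.

```javascript
// Stand-in for Iterator: walks a plain array of records in order.
class StubIterator {
  constructor(list) { this.list = list; this.pos = 0; }
  hasNext() { return this.pos < this.list.length; }
  next() { return this.list[this.pos++]; }
}

// Stand-in for Thread: executes each call at once (no real concurrency).
class StubThread {
  constructor() { this.results = []; }
  execute(fn, ...args) { this.results.push(fn(...args)); }
  // Assumption: returning the results is a convenience of this sketch only.
  wait() { return this.results; }
}

// Hypothetical per-record function, analogous to _functionIterator_1.
function processRecord(structureInstance, recordInstance) {
  return structureInstance.prefix + recordInstance.TITLE;
}

const records = [{ TITLE: "a" }, { TITLE: "b" }, { TITLE: "c" }];
const iterator = new StubIterator(records);
const thread = new StubThread();
while (iterator.hasNext()) {
  const recordInstance = iterator.next();
  thread.execute(processRecord, { prefix: "item-" }, recordInstance);
}
const processed = thread.wait();
console.log(processed);
```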

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: sends a query to any source accessible through JDBC and returns a record list with the results obtained.

• Functions:

  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)

    • uuid: unique identifier of the component.

    • uri: connection URL to the database.

    • driver: driver class used to connect to the data source.

    • userName: user name.

    • password: user password.

    • structure: structure of the component’s output record list. It is defined as a record of values.

    • baseRecords: record list to be used.

    • maxPoolSize: maximum number of connections that the pool can manage at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

    • checkQuery: SQL query used by the pool to verify the status of the connections currently cached. The query must be simple, and the queried table must exist.

    • query: SQL query that returns the results required by the component.

  o exec(query, baseRecords): executes the JDBCExtractor component.

    • query: SQL query that returns the results required by the component.

    • baseRecords: record list to be used.

  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.

    • maxPoolSize: maximum number of connections that the pool can manage at the same time.

    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.

    • pingQuery: SQL query used by the pool to verify the status of the connections currently cached. The query must be simple, and the queried table must exist.

  o disablePool(): disables the connection pool.

  o addDriverProperty(propname, propvalue): adds a JDBC driver property.

    • propname: property name.

    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: iterates over a series of inter-related pages using one or several browsing sequences.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.

    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information.

    • inputPage: indicates the page from which the next browsing sequence is to be run.

  o next(inputRecords, inputPage): returns the next element of the iteration.

    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.

    • inputPage: indicates the page from which the next pages are to be accessed.

  o close(): closes the iterator.

  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.

    • count: number of retries.

  o setRetryDelay(count): configures the interval between two retries.

    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence for returning to the source page, so that further data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection is reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: HTTP browser implementation

5.3.21 Output

• Object: Output

• Description: places a record in the wrapper output.

• Functions:

  o Constructor(structure)

    • structure: indicates the component input record to be used as the wrapper result.

  o add(record): adds the given record to the wrapper result.

    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: constructs a record from other records generated in the flow, and can generate new attributes derived from existing ones.

• Functions:

  o Constructor(recordsObj, name)

    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.

    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): adds a new field to the record under construction.

    • fieldName: name of the field.

    • expression: expression that defines the field; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    • errorAction: action to take if the expression cannot be evaluated correctly. The possible values are:

      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.

      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements except for the one that caused the error.
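To make the roles of the expression and errorAction arguments more concrete, the sketch below resolves simple field references against a list of input records and applies the two error actions. It is only an approximation written in plain JavaScript: the reference form "$<index>.<field>", the resolve and buildRecord helpers and the sample field names are assumptions for the illustration, and the expression functions of [GENER] are not covered.

```javascript
const ON_ERROR_RAISE = "ON_ERROR_RAISE";   // stand-in constants
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Resolve a reference of the assumed form "$<index>.<field>".
function resolve(expression, recordsObj) {
  const match = /^\$(\d+)\.(\w+)$/.exec(expression);
  if (!match) throw new Error("Unsupported expression: " + expression);
  const record = recordsObj[Number(match[1])];
  if (!record || !(match[2] in record)) {
    throw new Error("Cannot evaluate: " + expression);
  }
  return record[match[2]];
}

// Build a record from [fieldName, expression, errorAction] triples.
function buildRecord(recordsObj, definitions) {
  const out = {};
  for (const [fieldName, expression, errorAction] of definitions) {
    try {
      out[fieldName] = resolve(expression, recordsObj);
    } catch (err) {
      if (errorAction === ON_ERROR_RAISE) throw err;
      // ON_ERROR_IGNORE: leave this field out and continue.
    }
  }
  return out;
}

const inputs = [{ PARAM1: "Madrid" }, { PRICE: 12 }];
const built = buildRecord(inputs, [
  ["CITY", "$0.PARAM1", ON_ERROR_RAISE],
  ["COST", "$1.PRICE", ON_ERROR_RAISE],
  ["MISSING", "$1.NOPE", ON_ERROR_IGNORE], // fails, but is ignored
]);
console.log(built);
```

Changing the last triple to ON_ERROR_RAISE makes buildRecord throw instead of returning a partial record.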

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: creates a browsing sequence from the results of a record. It allows sequences that access other pages to be created from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.

    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information. In general this value is “true”, although it may not be a good option in some cases if the previous iterator is run in parallel.

    • inputPage: optional; allows a starting page to be indicated.

  o exec(): returns a Page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.

• Functions:

  o Constructor(page)

    • page: page loaded in the browser that is going to be released.

  o Constructor(browserUuid)

    • browserUuid: browser identifier.

  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
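The practical difference between the while form used by the Loop component and the do… while form used by Repeat is that the body of a do… while loop always runs at least once, because the condition is checked after each pass. The plain JavaScript sketch below shows this; makeCondition is a hypothetical stand-in for a Condition object (only its exec() call is imitated), and the counters exist purely for the demonstration.

```javascript
// Hypothetical stand-in for a Condition: true while the counter is below a limit.
function makeCondition(state, limit) {
  return { exec: () => state.count < limit };
}

// while form: the condition is checked before the first pass.
function runWhile(limit) {
  const state = { count: 0, passes: 0 };
  const loop = makeCondition(state, limit);
  while (loop.exec([])) {
    state.count++;
    state.passes++;
  }
  return state.passes;
}

// do...while form: the body runs once before the condition is checked.
function runDoWhile(limit) {
  const state = { count: 0, passes: 0 };
  const repeat = makeCondition(state, limit);
  do {
    state.count++;
    state.passes++;
  } while (repeat.exec([]));
  return state.passes;
}

console.log(runWhile(0), runDoWhile(0)); // 0 1
console.log(runWhile(3), runDoWhile(3)); // 3 3
```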

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it occupies within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)

    • sequence: NSEQL browsing program (see [NSEQL]).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection is reused (“true”) or not (“false”). See [GENER] for further information.

    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.

  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.

    • inputValues: list of values that can be used as input parameters within the browsing sequence.

    • inputPage: optional parameter that indicates the page from which the component browsing sequence is run.

  o setRetries(count): updates the number of retries in the event of failure.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o close(): closes the connection with the running browser.

  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST request must be issued to the page URL with the POST parameters with which the page was reached. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence for returning to the source page, so that further data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection is reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

    • pages: number of pages to go back.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation

    • 1: Internet Explorer browser implementation

    • 2: Firefox browser implementation

    • 3: Denodo HTTP browser implementation
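The general idea behind setRetries and setRetryDelay (try again a limited number of times, pausing between attempts) can be sketched in plain JavaScript as follows. This is not ITPilot's implementation: withRetries is a hypothetical helper, the pause is delegated to an injected sleep function so that the example stays synchronous, and treating the count as the number of attempts made after the first failure is an assumption.

```javascript
// Run action(); on failure retry up to `retries` more times, calling
// sleep(delayMs) between attempts. `sleep` is injected (hypothetical helper).
function withRetries(action, retries, delayMs, sleep) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return action(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) sleep(delayMs);
    }
  }
  throw lastError; // every attempt failed
}

const waits = [];
const page = withRetries(
  (attempt) => {
    // Simulated browsing step that fails on its first two attempts.
    if (attempt < 2) throw new Error("page not loaded");
    return "page-ok";
  },
  3,                      // analogous to setRetries(3)
  500,                    // analogous to setRetryDelay(500)
  (ms) => waits.push(ms)  // records the pauses instead of really waiting
);
console.log(page, waits);
```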

5.3.28 Store File

• Object: StoreFile

• Description: stores in a file the contents received as the input parameter.

• Functions:

  o Constructor(content, file)

    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.

    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): updates the number of retries in the event of failure.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread

• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): causes the thread to wait until all the executions launched with the execute function have finished.

  o execute(functionName, <list of arguments>): launches a thread of execution on the given function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.

    • int: maximum number of concurrent threads.
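The queuing behaviour described for setMaxConcurrentThreads can be pictured with a small simulation: given the running time of each task and a concurrency limit, it computes when each task would start. This is only a model of the idea, not of how ITPilot schedules threads; the schedule function and the time units are hypothetical.

```javascript
// Simulate when each task would start if at most maxConcurrent run at once.
// durations[i] is the (hypothetical) running time of task i, in arbitrary units.
function schedule(durations, maxConcurrent) {
  const slotFreeAt = new Array(maxConcurrent).fill(0);
  return durations.map((duration) => {
    // Pick the slot that becomes free first; the task waits until then.
    let slot = 0;
    for (let i = 1; i < slotFreeAt.length; i++) {
      if (slotFreeAt[i] < slotFreeAt[slot]) slot = i;
    }
    const start = slotFreeAt[slot];
    slotFreeAt[slot] = start + duration;
    return start;
  });
}

// Four tasks, at most two at a time: the third and fourth are queued.
const starts = schedule([3, 3, 3, 1], 2);
console.log(starts); // [ 0, 0, 3, 3 ]
```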

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:

    • LIST_TYPE = 1

    • PAGE_TYPE = 2

    • RECORD_TYPE = 3

    • SIMPLE_TYPE = 4

    • ARRAY_TYPE = 5

    • BINARY_TYPE = 6

    • BOOLEAN_TYPE = 7

    • DATE_TYPE = 8

    • DOUBLE_TYPE = 9

    • FLOAT_TYPE = 10

    • INT_TYPE = 11

    • LONG_TYPE = 12

    • STRING_TYPE = 13

    • URL_TYPE = 14

    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
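Put together, a component file for a hypothetical component named “mycustom” might look like the skeleton below. The four function names and the RECORD_TYPE value come from the list above; everything else is an assumption made so that the file can be run on its own. In particular, records are modelled as plain objects, and the shapes returned by the two structure functions are placeholders, since this guide does not show the format ITPilot expects (see [GENER]).

```javascript
// Output type constant, as listed in this section.
const RECORD_TYPE = 3;

// Main function: receives the input element and returns the output.
// The record is modelled as a plain object (an assumption of this sketch).
function mycustom_main(mycustom_input) {
  var mycustom_output = null;
  mycustom_output = { UPPER_NAME: String(mycustom_input.NAME).toUpperCase() };
  return mycustom_output;
}

// Placeholder input schema; the real structure format is not shown in this guide.
function mycustom_getInputStructure() {
  return { NAME: "text" };
}

// Declares that the component returns a record.
function mycustom_getOutputType() {
  return RECORD_TYPE;
}

// Needed because the output type is RECORD_TYPE; placeholder shape again.
function mycustom_getOutputStructure() {
  return { UPPER_NAME: "text" };
}

console.log(mycustom_main({ NAME: "denodo" }));
```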

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following code:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> represents the name of the component.

• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple; the VQL statement just has to be written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used inside it, they must be escaped with the ‘’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

Page 45: Denodo ITPilot 4.6 Developer Guidehelp.denodo.com/.../4.6/DenodoITPilot.Developer.pdf · 2017-10-25 · ITPilot 4.6 Developer Guide Deploying and Invoking ITPilot Wrapper Access Web

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 40

5.3.15 Get Page

• Object: Get_Page
• Description: obtains an active browser from the browser pool, using a previously retrieved identification code.
• Functions:
  o Constructor(browserUuid): obtains (or optionally creates) the handler to an active browser from its identification.
    • browserUuid: browser id.
  o exec(pageType, lastURL, lastURLMethod, lastURLPostParameters, cookie, proxyUser, proxyPassword, proxyDomain): executes the component and returns a Page object with information about the browser's current state. The function can also be executed with no parameters, for later browsing with a Sequence object (see section 5.3.27).
    • pageType: type of browser used to access the page:
      • SEQUENCE_IEBROWSER = 1
      • SEQUENCE_HTTP_BROWSER = 2
    • lastURL: last URL the page is coming from.
    • lastURLMethod: access method (GET, POST) of the URL the page is coming from.
    • lastURLPostParameters: POST-method parameters of the URL the page is coming from.
    • cookie: stored “cookies” information.
    • proxyUser: user name to access the proxy, if required.
    • proxyPassword: user password to access the proxy, if required.
    • proxyDomain: proxy domain, if required.

5.3.16 Init

• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper will receive from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a record field created as a group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. The format is optional but, if specified, it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the i18n of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component; it returns a record representing the wrapper initialization parameters.
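As a rough illustration of how the Init functions fit together, the following sketch declares four searchable parameters. In a real wrapper the Init object and the OPTIONAL, OBLIGATORY and FIXED constants are supplied by ITPilot; the stand-in defined at the top is an assumption, not the ITPilot implementation, and only records the calls so that the snippet can be run in isolation. The field names and the date pattern are hypothetical.

```javascript
// Stand-ins for the ITPilot runtime (assumption: the real constants and the
// real Init object come from ITPilot, and their actual values may differ).
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

function Init() {
    this.fields = [];
}
// Records one field definition; obl defaults to OPTIONAL, as in the guide.
Init.prototype._add = function (type, field, obl, fixedValue, format) {
    this.fields.push({
        type: type, name: field, obl: obl || OPTIONAL,
        fixedValue: fixedValue, format: format
    });
};
Init.prototype.setText = function (field, obl, fixedValue) {
    this._add("text", field, obl, fixedValue);
};
Init.prototype.setInt = function (field, obl, fixedValue) {
    this._add("int", field, obl, fixedValue);
};
Init.prototype.setDate = function (field, format, obl, fixedValue) {
    this._add("date", field, obl, fixedValue, format);
};
// The real exec() returns the initialization record; this stand-in returns
// the list of recorded field definitions instead.
Init.prototype.exec = function () {
    return this.fields;
};

// Wrapper-side usage (hypothetical field names).
var init = new Init();
init.setText("KEYWORD", OBLIGATORY);            // must be given in every query
init.setInt("MAX_RESULTS");                     // optional (default)
init.setText("LANGUAGE", FIXED, "en");          // constant value
init.setDate("SINCE", "dd/MM/yyyy", OPTIONAL);  // date with an explicit format
var inputDefinition = init.exec();
```

Note that setDate takes the format as its second argument, before obl, unlike the other setters.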

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphical interface translates into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5. Using threads in the Iterator component
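The pattern of Figure 5 can be tried outside ITPilot with small stand-ins, as sketched below. The Iterator and Thread objects defined here are simplified substitutes (an assumption, not the ITPilot implementation): the stand-in Thread runs each function immediately instead of in parallel, and it takes a function reference. The record fields and the processRecord function are hypothetical.

```javascript
// Stand-in for the ITPilot Iterator: walks a plain array of records.
function Iterator(list) {
    this.list = list;
    this.position = 0;
}
Iterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};
Iterator.prototype.next = function () {
    return this.list[this.position++];
};

// Stand-in for the ITPilot Thread: runs the function at once (no real
// parallelism), so wait() has nothing left to wait for.
function Thread() {}
Thread.prototype.execute = function (fn) {
    var args = Array.prototype.slice.call(arguments, 1);
    fn.apply(null, args);
};
Thread.prototype.wait = function () {};

// Hypothetical per-record work: collect one field of each record.
var titles = [];
function processRecord(record) {
    titles.push(record.TITLE);
}

var iterator = new Iterator([{ TITLE: "first" }, { TITLE: "second" }]);
var thread = new Thread();
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    thread.execute(processRecord, recordInstance);
}
thread.wait(); // in ITPilot, this blocks until every execute() call has finished
```

Calling wait() after the loop matters in the real component: results gathered by the launched functions should only be read once all executions have finished.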

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: component unique identifier.
    • uri: connection URL to the database.
    • driver: driver class to use to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
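Because the constructor takes eleven positional arguments, a call written out in full may help to keep the order straight. The sketch below uses a stand-in JDBCExtractor (an assumption, not the ITPilot implementation) that stores its configuration and returns an empty list instead of querying a database. The connection data, table name and driver property are hypothetical, and null is passed where the output structure and base record list would go.

```javascript
// Stand-in for the ITPilot JDBCExtractor (assumption): it only stores the
// configuration; exec() returns an empty list instead of running the query.
function JDBCExtractor(uuid, uri, driver, userName, password, structure,
                       baseRecords, maxPoolSize, initialPoolSize,
                       checkQuery, query) {
    this.uri = uri;
    this.driver = driver;
    this.query = query;
    this.pool = { max: maxPoolSize, initial: initialPoolSize, ping: checkQuery };
    this.driverProperties = {};
}
JDBCExtractor.prototype.setPoolConfig = function (maxPoolSize, initialPoolSize, pingQuery) {
    this.pool = { max: maxPoolSize, initial: initialPoolSize, ping: pingQuery };
};
JDBCExtractor.prototype.disablePool = function () {
    this.pool = null;
};
JDBCExtractor.prototype.addDriverProperty = function (propname, propvalue) {
    this.driverProperties[propname] = propvalue;
};
JDBCExtractor.prototype.exec = function (query, baseRecords) {
    this.query = query || this.query;
    return [];
};

// Hypothetical connection data, in the documented argument order.
var extractor = new JDBCExtractor(
    "jdbc-extractor-1",                        // uuid
    "jdbc:postgresql://localhost:5432/sales",  // uri
    "org.postgresql.Driver",                   // driver
    "report_user", "secret",                   // userName, password
    null, null,                                // structure, baseRecords (placeholders)
    10, 2,                                     // maxPoolSize, initialPoolSize
    "SELECT 1",                                // checkQuery
    "SELECT id, name FROM customer");          // query
extractor.addDriverProperty("ssl", "true");
extractor.setPoolConfig(5, 1, "SELECT 1");     // overrides the pool settings above
var rows = extractor.exec("SELECT id, name FROM customer WHERE id < 100", null);
```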

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6. Using the Loop function

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: allows iterating through different inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used to date is reused, or whether a new browser is launched maintaining the session's information.
    • inputPage: indicates the page from which the next browsing sequence is to be made.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements, except for the one that caused the error.

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from the pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used to date is reused, or whether a new browser is launched maintaining the session's information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7. Using the Repeat function
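One practical difference between the Repeat component and the Loop component (section 5.3.19) follows from the JavaScript constructs they map to: the body of a do… while loop runs once before the condition is first evaluated, whereas a while loop may not run at all. The sketch below shows this with a stand-in Condition that wraps a plain JavaScript function. That stand-in is an assumption made so that the snippet runs on its own; the real Condition object evaluates an ITPilot expression.

```javascript
// Stand-in for the ITPilot Condition object (assumption): exec() simply
// calls the JavaScript predicate passed to the constructor.
function Condition(predicate) {
    this.predicate = predicate;
}
Condition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// A condition that is false from the very first evaluation.
var neverTrue = new Condition(function () { return false; });

// Loop component pattern: the condition is checked before each pass.
var loopPasses = 0;
while (neverTrue.exec([])) {
    loopPasses++;
}

// Repeat component pattern: the body runs once before the first check.
var repeatPasses = 0;
do {
    repeatPasses++;
} while (neverTrue.exec([]));
```

With this condition, loopPasses stays at 0 while repeatPasses ends at 1.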

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place the component holds within the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter; indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; describes the page from which the component's browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence back to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation

5.3.28 Store File

• Object: StoreFile
• Description: stores the contents received as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
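A minimal sketch of such a file is shown below, for a hypothetical component named “shout” that upper-cases its input. It assumes that the input arrives as a plain string and that ITPilot predefines the output-type constants; STRING_TYPE is redeclared here only so that the file can be run on its own. The input-structure function is left out of the sketch because the format of its body is not described in this section, and no output-structure function is included because the output type is neither RECORD_TYPE nor LIST_TYPE.

```javascript
// Assumption: ITPilot predefines the output-type constants; this one is
// redeclared only so that the sketch is self-contained.
var STRING_TYPE = 13;

// Main function of the hypothetical "shout" component.
// Assumption: shout_input is a plain string (or null).
function shout_main(shout_input) {
    var shout_output = null;
    if (shout_input !== null && shout_input !== undefined) {
        shout_output = String(shout_input).toUpperCase();
    }
    return shout_output;
}

// Declares the type of the value returned by shout_main.
function shout_getOutputType() {
    return STRING_TYPE;
}
```

Following the naming pattern above, every function of the component carries the component name as a prefix.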

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file, the name of which matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8. Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the VQL statement only has to be written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the recently generated JavaScript code.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

  • DENODO ITPILOT 46 DEVELOPER GUIDE
  • INDEX
  • FIGURES
  • PREFACE
  • 1 INTRODUCTION
  • 2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
    • 21 WEB SERVICE TYPES
    • 22 INVOKING SOAP WEB SERVICES
    • 23 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
      • 231 HTML Output Configuration
        • 24 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
          • 3 ITPILOT DEVELOPMENT API
            • 31 CONNECTING TO THE SERVER
            • 32 OBTAINING WRAPPERS
            • 33 USING WRAPPERS
            • 34 PROCESSING QUERY RESULTS

5.3.16 Init

• Object: Init
• Description: stores the structure of the input data, that is, the data that the wrapper receives from the calling application.
• Functions:
  o Constructor(input, output)
    • input: input record of the component. Optional; used only when custom components are created (see section 5.4). In standard processes, ITPilot takes this information from the JavaScript context.
    • output: name of the output record of the component, which represents the query parameters of the wrapper. Its use is optional in the main function of a standard process; if it is not specified, the record is generated at runtime (with the exec() function).
  o get(name): returns the value of a record field created as a group of initialization parameters.
    • name: name of the record field.
  o setText(field, obl, fixedValue): creates a text-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setInt(field, obl, fixedValue): creates an integer-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional, but once it is specified it becomes compulsory; otherwise the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: indicates whether the field is compulsory in queries on the wrapper. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): updates the component name.
    • name: new component name.
  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
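The way the obl values combine with the set functions can be sketched as follows. The Init object and the OPTIONAL, OBLIGATORY and FIXED constants are only available inside the ITPilot engine, so this sketch declares minimal stand-ins (InitStandIn and its define helper are assumptions made for illustration, not Denodo's implementation) in order to run in any JavaScript interpreter. The field names TITLE, MAX_PRICE and COUNTRY are hypothetical.

```javascript
// Stand-ins for the ITPilot globals (assumptions, for illustration only).
var OPTIONAL = "OPTIONAL", OBLIGATORY = "OBLIGATORY", FIXED = "FIXED";

// Minimal stand-in for the Init component: it only records field definitions.
function InitStandIn() {
    this.fields = {};
}
// Hypothetical helper shared by the set* functions below.
InitStandIn.prototype.define = function (field, type, obl, fixedValue) {
    var mode = (obl === undefined) ? OPTIONAL : obl;  // OPTIONAL is the default
    this.fields[field] = {
        type: type,
        obl: mode,
        value: (mode === FIXED) ? fixedValue : null   // only FIXED fields carry a constant
    };
};
InitStandIn.prototype.setText = function (field, obl, fixedValue) {
    this.define(field, "text", obl, fixedValue);
};
InitStandIn.prototype.setInt = function (field, obl, fixedValue) {
    this.define(field, "int", obl, fixedValue);
};
InitStandIn.prototype.get = function (name) {
    return this.fields[name].value;
};
InitStandIn.prototype.exec = function () {
    return this.fields;  // the real exec() returns an ITPilot record instead
};

// In a wrapper this would read: var init = new Init();
var init = new InitStandIn();
init.setText("TITLE", OBLIGATORY);      // must be supplied in every query
init.setInt("MAX_PRICE");               // no obl argument: OPTIONAL
init.setText("COUNTRY", FIXED, "ES");   // constant, not supplied by the caller
var params = init.exec();
```

Only the FIXED field carries a value at definition time; the other two would receive theirs from the calling application when the wrapper is queried.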


5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records over which to iterate.
  o hasNext(): determines whether there are more results over which to iterate. Returns "true" if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
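For comparison, a sequential traversal without the Thread object could look like the sketch below. IteratorStandIn is an assumption made so that the code runs outside ITPilot: it wraps a plain array instead of an ITPilot record list, and the TITLE field is hypothetical.

```javascript
// Stand-in for the ITPilot Iterator component (assumption: a plain array
// plays the role of the record list).
function IteratorStandIn(list) {
    this.list = list;
    this.position = 0;
}
IteratorStandIn.prototype.hasNext = function () {
    return this.position < this.list.length;
};
IteratorStandIn.prototype.next = function () {
    return this.list[this.position++];
};

// Sequential traversal: each record is handled before the next one is fetched.
var records = [{ TITLE: "A" }, { TITLE: "B" }, { TITLE: "C" }];
var iterator = new IteratorStandIn(records);  // in a wrapper: new Iterator(list)
var titles = [];
while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);
}
```

Calling hasNext() before every next() keeps the loop from reading past the end of the list.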


5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class to use to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the connection pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established and kept ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.


5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE... DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }

Figure 6 Using the Loop function
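The control flow of Figure 6 can be tried outside ITPilot with the sketch below. ConditionStandIn is an assumption: it evaluates a plain JavaScript function, whereas the real Condition object evaluates an ITPilot expression. The page counter is hypothetical.

```javascript
// Stand-in for the Condition component (assumption: exec() evaluates a
// plain function instead of an ITPilot condition expression).
function ConditionStandIn(test) {
    this.test = test;
}
ConditionStandIn.prototype.exec = function (inputs) {
    return this.test(inputs);
};

var page = 1;
var visited = [];
var loop = new ConditionStandIn(function () { return page <= 3; });
while (loop.exec([])) {          // the condition is checked BEFORE each pass
    visited.push(page);
    page = page + 1;
}

// A condition that is false from the start means the body never runs.
var skipped = 0;
var never = new ConditionStandIn(function () { return false; });
while (never.exec([])) {
    skipped = skipped + 1;
}
```

Because the test comes first, a Loop body may execute zero times, which is the practical difference from the Repeat component.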


5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: allows iterating over several inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() method is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.


5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.


5.3.22 Record Constructor

• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression. For example, "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements, except for the one that caused the error.
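The interplay between add() and the two error actions can be sketched as follows. RecordConstructorStandIn is an assumption made so that the code runs outside ITPilot: it only understands references of the form $<index>.<FIELD> (none of the functions of Appendix A of [GENER]), it uses plain objects as records, and it interprets ON_ERROR_IGNORE as "leave the field out". The record contents are hypothetical.

```javascript
// Stand-ins for the ITPilot constants (assumptions, for illustration only).
var ON_ERROR_RAISE = "ON_ERROR_RAISE", ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Stand-in for Record_Constructor that resolves only "$<index>.<FIELD>".
function RecordConstructorStandIn(recordsObj, name) {
    this.recordsObj = recordsObj;
    this.name = name;
    this.definitions = [];
}
RecordConstructorStandIn.prototype.add = function (fieldName, expression, errorAction) {
    this.definitions.push({
        fieldName: fieldName, expression: expression, errorAction: errorAction
    });
};
RecordConstructorStandIn.prototype.exec = function () {
    var result = {};
    for (var i = 0; i < this.definitions.length; i++) {
        var def = this.definitions[i];
        var match = /^\$(\d+)\.(\w+)$/.exec(def.expression);
        var source = match ? this.recordsObj[Number(match[1])] : undefined;
        if (source && match[2] in source) {
            result[def.fieldName] = source[match[2]];
        } else if (def.errorAction === ON_ERROR_RAISE) {
            throw new Error("Cannot evaluate " + def.expression);
        }
        // ON_ERROR_IGNORE: the field is skipped and construction continues.
    }
    return result;
};

var product = { NAME: "Lamp", PRICE: 20 };
var shop = { CITY: "Madrid" };
var rc = new RecordConstructorStandIn([product, shop], "OFFER");
rc.add("NAME", "$0.NAME", ON_ERROR_RAISE);     // first input record
rc.add("CITY", "$1.CITY", ON_ERROR_RAISE);     // second input record
rc.add("STOCK", "$0.STOCK", ON_ERROR_IGNORE);  // missing field, ignored
var offer = rc.exec();
```

With ON_ERROR_RAISE on the STOCK field instead, exec() would throw rather than return a partial record.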


5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; allows a homepage to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.


5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.


5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT... UNTIL). The Repeat component is implemented in JavaScript using a do... while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
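The do... while structure of Figure 7 can be tried outside ITPilot with the sketch below. ConditionStandIn is an assumption: it evaluates a plain JavaScript function, whereas the real Condition object evaluates an ITPilot expression. The sketch follows the code of Figure 7 literally, so the body is repeated while exec() returns true.

```javascript
// Stand-in for the Condition component (assumption: exec() evaluates a
// plain function instead of an ITPilot condition expression).
function ConditionStandIn(test) {
    this.test = test;
}
ConditionStandIn.prototype.exec = function (inputs) {
    return this.test(inputs);
};

var attempts = 0;
var repeat = new ConditionStandIn(function () { return attempts < 3; });
do {
    attempts = attempts + 1;     // the body runs BEFORE the first check
} while (repeat.exec([]));

// Even a condition that is false from the start lets the body run once.
var runs = 0;
var alwaysFalse = new ConditionStandIn(function () { return false; });
do {
    runs = runs + 1;
} while (alwaysFalse.exec([]));
```

Since the condition is evaluated after the body, a Repeat body always executes at least once.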


5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated. When the component is used from the graphic generation interface, it becomes a JavaScript function that is invoked from the position it occupies within the process flow.


5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter; indicates the starting page. If it is not specified, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; describes the page from which the component browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.


5.3.28 Store File

• Object: StoreFile
• Description: stores the contents received as input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string-type or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.


5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread of execution on the indicated function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.


5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
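As an illustration of the naming scheme, the sketch below outlines a hypothetical component called "upper". It assumes that the input arrives as a plain string, which may not match the input element that ITPilot actually passes, and it leaves out upper_getInputStructure. The needsOutputStructure helper is also hypothetical; it only restates the rule about when the output-structure function is required.

```javascript
// Output type codes taken from the list above (subset used in this sketch).
var LIST_TYPE = 1, RECORD_TYPE = 3, STRING_TYPE = 13;

// Main function of the hypothetical "upper" component.
// Assumption: the input is treated as a plain string.
function upper_main(upper_input) {
    var upper_output = null;
    upper_output = String(upper_input).toUpperCase();
    return upper_output;
}

// Declares that the component returns a string.
function upper_getOutputType() {
    return STRING_TYPE;
}

// Hypothetical helper: tells whether <component>_getOutputStructure
// has to be defined for a given output type.
function needsOutputStructure(outputType) {
    return outputType === RECORD_TYPE || outputType === LIST_TYPE;
}

var shout = upper_main("itpilot");
var structureRequired = needsOutputStructure(upper_getOutputType());
```

Because the component declares STRING_TYPE, no upper_getOutputStructure function is needed; a component returning RECORD_TYPE or LIST_TYPE would have to provide one.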

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives as input.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the VQL statement only has to be written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the '\' character.
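The escaping step can be automated with a small helper such as the one sketched below. buildCreateWrapper is a hypothetical function, not part of ITPilot. It assumes that the backslash is the escape character and takes the delimiting quote as a parameter, because the exact delimiter should be checked against [VQL]; the wrapper name and script are also hypothetical.

```javascript
// Hypothetical helper: builds the text of the CREATE WRAPPER statement.
// Assumptions: `quote` is the delimiter that VQL expects around the code,
// and a backslash escapes any occurrence of that delimiter inside it.
function buildCreateWrapper(name, jscode, quote) {
    var escaped = jscode.split(quote).join("\\" + quote);
    return "CREATE WRAPPER ITP " + name + " " + quote + escaped + quote;
}

var script = 'var s = "hello";';
var vql = buildCreateWrapper("mywrapper", script, '"');
```

Only the delimiter itself is escaped here; whether other characters need escaping is not covered by this sketch.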


REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl


  o setLong(field, obl, fixedValue): creates a long-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setFloat(field, obl, fixedValue): creates a float-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDouble(field, obl, fixedValue): creates a double-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBlob(field, obl, fixedValue): creates a BLOB-type (binary large object) field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field, as defined in [DATEFORMAT]. The format is optional, but becomes compulsory once it is filled in; otherwise the wrapper may not run.
    • obl: indicates whether the field is compulsory in queries. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value, which is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.

  o setName(name): updates the component name.
    • name: new component name.

  o setI18n(i18n): updates the i18n configuration of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.

  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
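As a rough illustration of how the setters above might be combined before calling exec(), the sketch below uses a small stand-in `Init` class. The stand-in, its internal record layout, the field names and the string values chosen for OPTIONAL, OBLIGATORY and FIXED are all assumptions made so the snippet can run outside ITPilot; they are not the real implementation.

```javascript
// Stand-in constants: the real values used by ITPilot are not shown in this guide.
const OPTIONAL = "OPTIONAL";
const OBLIGATORY = "OBLIGATORY";
const FIXED = "FIXED";

// Hypothetical stand-in for the Init component, for illustration only.
class Init {
  constructor() {
    this.fields = {};
  }
  // Shared helper (not part of the documented API) that registers one field.
  _add(type, field, obl = OPTIONAL, fixedValue = null, format = null) {
    // Assumption: a FIXED field without a constant value is treated as an error.
    if (obl === FIXED && fixedValue === null) {
      throw new Error("FIXED field '" + field + "' needs a fixedValue");
    }
    this.fields[field] = { type, obl, fixedValue, format };
  }
  setLong(field, obl, fixedValue) { this._add("long", field, obl, fixedValue); }
  setBoolean(field, obl, fixedValue) { this._add("boolean", field, obl, fixedValue); }
  setDate(field, format, obl, fixedValue) { this._add("date", field, obl, fixedValue, format); }
  // Returns a record-like object describing the initialization parameters.
  exec() { return this.fields; }
}

const init = new Init();
init.setLong("CUSTOMER_ID", OBLIGATORY);                   // required in every query
init.setBoolean("ACTIVE_ONLY");                            // OPTIONAL by default
init.setDate("SINCE", "dd/MM/yyyy", FIXED, "01/01/2011");  // constant value
const initRecord = init.exec();
```

Only the argument order and the role of obl and fixedValue are taken from the descriptions above; everything else is scaffolding.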

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. “true” is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The “Parallel Execution” option of the ITPilot graphical interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

    var _thread0 = new Thread();
    while (iterator.hasNext()) {
        recordInstance = iterator.next();
        _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
    }

Figure 5 Using threads in the Iterator component
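To make the pattern of Figure 5 easier to try out, the following sketch reproduces it with minimal stand-ins for Iterator and Thread. Both classes, the record fields and the body of `_functionIterator_1` are invented for illustration. The stand-in Thread runs each call immediately rather than in parallel, and it receives the function itself; whether the real execute() expects the function or its name is not shown in this guide.

```javascript
// Hypothetical stand-ins so the pattern can run outside ITPilot.
class Iterator {
  constructor(list) { this.list = list; this.pos = 0; }
  hasNext() { return this.pos < this.list.length; }
  next() { return this.list[this.pos++]; }
}
class Thread {
  // Stand-in: runs the function immediately instead of on a separate thread.
  execute(fn, ...args) { fn(...args); }
  wait() { /* nothing to wait for in this synchronous stand-in */ }
}

const processed = [];
// Plays the role of the generated _functionIterator_1 function.
function _functionIterator_1(structureInstance, recordInstance) {
  processed.push(structureInstance.prefix + recordInstance.TITLE);
}

const structureInstance = { prefix: "item:" };
const iterator = new Iterator([{ TITLE: "a" }, { TITLE: "b" }, { TITLE: "c" }]);
const _thread0 = new Thread();
while (iterator.hasNext()) {
  const recordInstance = iterator.next();
  _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}
_thread0.wait(); // in ITPilot, blocks until every launched execution has finished
```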

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: sends a query to any source available via JDBC and returns a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the connection pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6 Using the Loop function
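The structure in Figure 6 can be exercised with a small stand-in for the Condition object, as sketched below. The stand-in class, the constant values and the use of a plain JavaScript predicate as the condition are assumptions for illustration; real conditions are written with the functions listed in Appendix A of [GENER].

```javascript
// Stand-in constants and Condition class; assumptions for illustration only.
const RUNTIME_ERROR = "RUNTIME_ERROR";
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
class Condition {
  // The "expression" here is a plain JavaScript predicate (an assumption).
  constructor(predicate) { this.predicate = predicate; this.handlers = {}; }
  onError(errorType, action) { this.handlers[errorType] = action; }
  exec(inputs) { return Boolean(this.predicate(inputs)); }
}

let page = 1;
const visited = [];
const loop = new Condition(() => page <= 3); // keep looping while pages remain
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
  visited.push(page); // stands in for <loop operations>
  page++;
}
```

Because the condition is evaluated before each pass, the body does not run at all when the condition is false from the start.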

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: allows iterating over different inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out there.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: indicates the component input record to be used as the wrapper result.
  o add(record): subsequently adds the component input record to be used as the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 of the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, the Record Constructor returns the list of filtered elements except for the one that caused the error.
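The positional reference used in field expressions (index of the input element, then field name) can be pictured with the sketch below. The stand-in `Record_Constructor` only understands expressions of the assumed form `$<index>.<FIELD>`, and it treats ON_ERROR_IGNORE as "skip the field that failed"; both points, together with the constant values and sample records, are assumptions for illustration rather than the behaviour of the real component.

```javascript
// Stand-in constants; the real values are not shown in this guide.
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
const ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical stand-in: only resolves expressions of the form "$<index>.<FIELD>".
class Record_Constructor {
  constructor(recordsObj, name) {
    this.recordsObj = recordsObj;
    this.name = name;
    this.definitions = [];
  }
  add(fieldName, expression, errorAction) {
    this.definitions.push({ fieldName, expression, errorAction });
  }
  exec() {
    const result = {};
    for (const { fieldName, expression, errorAction } of this.definitions) {
      const match = /^\$(\d+)\.(\w+)$/.exec(expression);
      const source = match ? this.recordsObj[Number(match[1])] : undefined;
      if (!source || !(match[2] in source)) {
        if (errorAction === ON_ERROR_IGNORE) continue; // skip the failing field
        throw new Error(this.name + ": cannot evaluate " + expression);
      }
      result[fieldName] = source[match[2]];
    }
    return result;
  }
}

const searchRecord = { PARAM1: "denodo" };
const detailRecord = { PRICE: 42 };
const rc = new Record_Constructor([searchRecord, detailRecord], "OUTPUT_RECORD");
rc.add("QUERY", "$0.PARAM1", ON_ERROR_RAISE);    // PARAM1 of the first input element
rc.add("PRICE", "$1.PRICE", ON_ERROR_RAISE);     // PRICE of the second input element
rc.add("MISSING", "$1.STOCK", ON_ERROR_IGNORE);  // cannot be evaluated, so it is skipped
const outputRecord = rc.exec();
```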

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence of the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded on the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
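The practical difference from a while loop is that the body of a do… while structure always runs at least once, because the condition is only checked after each pass. The sketch below shows this with a stand-in Condition class; the class, the constant values and the plain JavaScript predicate are assumptions for illustration, and the loop simply continues while exec() returns true, as written in Figure 7.

```javascript
// Stand-in constants and Condition class; assumptions for illustration only.
const RUNTIME_ERROR = "RUNTIME_ERROR";
const ON_ERROR_RAISE = "ON_ERROR_RAISE";
class Condition {
  constructor(predicate) { this.predicate = predicate; this.handlers = {}; }
  onError(errorType, action) { this.handlers[errorType] = action; }
  exec(inputs) { return Boolean(this.predicate(inputs)); }
}

let attempts = 0;
const repeat = new Condition(() => attempts < 0); // false from the very start
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
  attempts++; // stands in for <loop_operations>; still runs once
} while (repeat.exec([]));

// The same condition in a while loop never enters the body.
let whileRuns = 0;
while (repeat.exec([])) {
  whileRuns++;
}
```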

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the point it occupies within the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter; indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; the page from which the component browsing sequence is run.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally sets an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out there.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
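The combined effect of setRetries and setRetryDelay can be pictured as a retry loop around the browsing sequence. The helper below is a generic sketch of that idea, not ITPilot code: `runWithRetries`, its parameters and the injectable `sleep` function are hypothetical, and it assumes that the retry count refers to additional attempts after the first failure, which this guide does not state explicitly.

```javascript
// Hypothetical helper: runs `action`, retrying up to `retries` extra times
// and calling `sleep(retryDelayMs)` between consecutive attempts.
function runWithRetries(action, retries, retryDelayMs, sleep) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return action(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < retries) sleep(retryDelayMs); // wait before the next try
    }
  }
  throw lastError; // every attempt failed
}

const waits = [];
const page = runWithRetries(
  (attempt) => {
    // Simulated navigation that fails twice before succeeding.
    if (attempt < 2) throw new Error("navigation failed");
    return "results page";
  },
  3,                      // comparable to setRetries(3)
  500,                    // comparable to setRetryDelay(500)
  (ms) => waits.push(ms)  // records the delay instead of really sleeping
);
```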

5.3.28 Store File

• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): updates the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to stand by until all executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches the execution thread on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
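The queueing effect of setMaxConcurrentThreads can be approximated with the simplified, synchronous model below. The stand-in `Thread` class is hypothetical: it stores each execute() request and, on wait(), drains the queue in rounds of at most the configured size, whereas a real implementation would start a queued request as soon as any running one finishes. The `rounds` property and the sample records exist only to make the behaviour observable.

```javascript
// Hypothetical stand-in that simulates the queueing behaviour synchronously.
class Thread {
  constructor() { this.max = Infinity; this.queue = []; this.rounds = []; }
  setMaxConcurrentThreads(max) { this.max = max; }
  // Stores the request; in this model nothing runs until wait() is called.
  execute(fn, ...args) { this.queue.push(() => fn(...args)); }
  // Drains the queue in rounds of at most `max` executions.
  wait() {
    while (this.queue.length > 0) {
      const batch = this.queue.splice(0, this.max);
      this.rounds.push(batch.length);
      batch.forEach((task) => task());
    }
  }
}

const titles = [];
function processRecord(record) {
  titles.push(record.TITLE.toUpperCase());
}

const thread = new Thread();
thread.setMaxConcurrentThreads(2);
const records = [{ TITLE: "a" }, { TITLE: "b" }, { TITLE: "c" }, { TITLE: "d" }, { TITLE: "e" }];
for (const record of records) {
  thread.execute(processRecord, record);
}
thread.wait(); // returns once every queued execution has run
```

With five requests and a limit of two, the model runs the work in rounds of two, two and one executions.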

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is only needed when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
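A minimal custom component file following the function layout above might look like the sketch below. The component name `mycustom`, its upper-casing logic and the plain object returned as input structure are invented for illustration; the API actually used to describe structures is not covered in this section. STRING_TYPE is declared locally, with the value listed above, only so that the snippet runs outside ITPilot.

```javascript
// Output type constant with the value listed above; declared here only so
// the sketch runs standalone (ITPilot presumably predefines it).
const STRING_TYPE = 13;

// Main function of a hypothetical custom component named "mycustom".
function mycustom_main(mycustom_input) {
  var mycustom_output = null;
  // Illustrative logic: trim and upper-case the input value.
  mycustom_output = String(mycustom_input).trim().toUpperCase();
  return mycustom_output;
}

// Input schema. A plain object is used as a placeholder because the real
// structure API is not described in this section.
function mycustom_getInputStructure() {
  return { VALUE: "string" };
}

// Declares the kind of value returned by mycustom_main.
function mycustom_getOutputType() {
  return STRING_TYPE;
}

// mycustom_getOutputStructure() is omitted on purpose: it is only required
// when the output type is RECORD_TYPE or LIST_TYPE.
```

Returning a simple type keeps the file short; a component that returns records or lists would also have to describe its output structure.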

5.4.2 Using Custom Components

To use a custom component developed in JavaScript, it must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
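The try… finally structure of Figure 8 guarantees that the scope is closed even if the component fails. The sketch below makes this visible with stand-ins for SCOPE and CUSTOM_COMPONENT; both stand-ins, the `depth` counter, the string arguments and the echoing exec() are hypothetical and only serve to let the snippet run outside ITPilot.

```javascript
// Hypothetical stand-ins for objects that ITPilot provides at run time.
const SCOPE = {
  depth: 0,
  create() { this.depth++; },
  close() { this.depth--; },
};
class CUSTOM_COMPONENT {
  constructor(type) { this.type = type; this.componentName = null; }
  setComponentName(name) { this.componentName = name; }
  // Stand-in behaviour: echoes what it received instead of running a component.
  exec(...inputs) { return { component: this.componentName, inputs }; }
}

let mycustom_output = null;
try {
  SCOPE.create();
  const mycustom = new CUSTOM_COMPONENT("mycustom");   // assumed type value
  mycustom.setComponentName("mycustom_1");             // assumed component name
  mycustom_output = mycustom.exec("first", "second");  // assumed input parameters
} finally {
  SCOPE.close(); // runs even if exec() throws, so the scope is always released
}
```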

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is simple: the following VQL statement has to be written:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

Page 48: Denodo ITPilot 4.6 Developer Guidehelp.denodo.com/.../4.6/DenodoITPilot.Developer.pdf · 2017-10-25 · ITPilot 4.6 Developer Guide Deploying and Invoking ITPilot Wrapper Access Web

ITPilot 4.6 Developer Guide

Developing ITPilot Wrappers with JavaScript 43

    • obl: parameter that indicates whether queries on the wrapper must supply a value for the field being created. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setBoolean(field, obl, fixedValue): creates a Boolean-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether queries on the wrapper must supply a value for the field being created. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setLink(field, obl, fixedValue): creates a URL-type field in the initialization record.
    • field: name of the field to create.
    • obl: parameter that indicates whether queries on the wrapper must supply a value for the field being created. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setDate(field, format, obl, fixedValue): creates a date-type field in the initialization record.
    • field: name of the field to create.
    • format: representation format of the date field. This format is optional but becomes compulsory if completed; otherwise, the wrapper may not run. The representation format is defined in [DATEFORMAT].
    • obl: parameter that indicates whether queries on the wrapper must supply a value for the field being created. The possible values are:
      • OPTIONAL (default value): the parameter is optional; it does not need to be assigned a value in each wrapper query.
      • OBLIGATORY: the parameter is obligatory in any query made on the wrapper.
      • FIXED: the parameter has a constant value; this value is assigned by the fixedValue parameter described below.
    • fixedValue: optional parameter that indicates a constant value assigned to the field.
  o setName(name): update function for the component name.
    • name: new component name.
  o setI18n(i18n): function that updates the i18n of the process.
    • i18n: type of internationalization to be used. ITPilot provides different i18n configurations, such as ES_EURO, US_PST, GB, etc. See [GENER] for more information about internationalization in ITPilot.
  o exec(): main function for running the component. It returns a record representing the wrapper initialization parameters.
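
As an illustration only, the three obligatoriness modes can be modelled with a small helper. The sketch below is not ITPilot code: resolveQueryValues, the field descriptors and the constant values are hypothetical, and it assumes that a FIXED field always takes its fixedValue even if the query supplies another value.

```javascript
// Illustrative constants; in a wrapper, ITPilot is assumed to provide them.
var OPTIONAL = "OPTIONAL";
var OBLIGATORY = "OBLIGATORY";
var FIXED = "FIXED";

// Hypothetical helper (not part of ITPilot): resolves the value of each
// initialization field for one query, following the three modes above.
function resolveQueryValues(fields, query) {
    var resolved = {};
    fields.forEach(function (field) {
        var supplied = Object.prototype.hasOwnProperty.call(query, field.name);
        if (field.obl === FIXED) {
            // A fixed field always takes its constant value (assumption:
            // a value supplied in the query is not used).
            resolved[field.name] = field.fixedValue;
        } else if (supplied) {
            resolved[field.name] = query[field.name];
        } else if (field.obl === OBLIGATORY) {
            throw new Error("Missing obligatory parameter: " + field.name);
        }
        // An OPTIONAL field without a value is simply left out.
    });
    return resolved;
}

var fields = [
    { name: "KEYWORD", obl: OBLIGATORY },
    { name: "IN_STOCK", obl: OPTIONAL },
    { name: "SITE", obl: FIXED, fixedValue: "http://www.example.com" }
];

var values = resolveQueryValues(fields, { KEYWORD: "java" });
console.log(JSON.stringify(values));
```

A query that omits KEYWORD would raise an error in this sketch, whereas omitting IN_STOCK is accepted.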

5.3.17 Iterator

• Object: Iterator
• Description: component that iterates over a list of records, one by one.
• Functions:
  o Constructor(list)
    • list: list of records on which to iterate.
  o hasNext(): determines whether there are more results on which to iterate. "true" is returned if there is at least one more result.
  o next(): returns the next iteration element. The list is a sorted sequence of records.

The "Parallel Execution" option of the ITPilot graphic interface is translated into the following JavaScript structure, which uses the Thread object described in section 5.3.29:

var _thread0 = new Thread();
while (iterator.hasNext()) {
    recordInstance = iterator.next();
    _thread0.execute(_functionIterator_1, structureInstance, recordInstance);
}

Figure 5 Using threads in the Iterator component
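
The hasNext()/next() contract can also be tried outside ITPilot with a small stand-in. The sketch below assumes a hypothetical ListIterator class that only mimics the documented behaviour of Iterator (it is not the ITPilot implementation) and uses plain objects in place of ITPilot records; it iterates sequentially, without the Thread object.

```javascript
// Hypothetical stand-in for the ITPilot Iterator component: it only mimics
// the documented hasNext()/next() behaviour over a plain array.
function ListIterator(list) {
    this.list = list;
    this.position = 0;
}

// Returns true while at least one more record is available.
ListIterator.prototype.hasNext = function () {
    return this.position < this.list.length;
};

// Returns the next record, following the order of the list.
ListIterator.prototype.next = function () {
    if (!this.hasNext()) {
        throw new Error("No more records to iterate over");
    }
    return this.list[this.position++];
};

// Plain objects stand in for ITPilot records (assumption for illustration).
var records = [{ TITLE: "first" }, { TITLE: "second" }, { TITLE: "third" }];
var iterator = new ListIterator(records);
var titles = [];

while (iterator.hasNext()) {
    var recordInstance = iterator.next();
    titles.push(recordInstance.TITLE);
}

console.log(titles.join(", "));
```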

5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the table it queries must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that the pool can manage at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established in advance, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple, and the table it queries must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
    <loop operations>
    …
}

Figure 6 Using the Loop function
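
The WHILE… DO behaviour can be checked in plain JavaScript with a stand-in for Condition. In the sketch below, StubCondition is hypothetical: it assumes only that exec() returns a Boolean, and it evaluates a JavaScript predicate rather than an ITPilot condition expression.

```javascript
// Hypothetical stand-in for the Condition component: exec() simply
// evaluates a JavaScript predicate and returns true or false.
function StubCondition(predicate) {
    this.predicate = predicate;
}
StubCondition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

var pagesVisited = 0;
var maxPages = 3; // illustrative limit, not an ITPilot setting

// WHILE... DO: the condition is checked before every pass, so the body
// may run zero times.
var loop = new StubCondition(function () {
    return pagesVisited < maxPages;
});

while (loop.exec([])) {
    pagesVisited++; // the loop operations would go here
}

console.log("pages visited: " + pagesVisited);
```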

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: allows iteration over different inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information.
    • inputPage: indicates the page from which the next browsing sequence is to be made.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components will be reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: HTTP browser implementation.
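
The order in which syncWithPost, setBackSequence and setBackPages are taken into account can be summarised as a small decision function. This is a sketch of the rules as described in this section, not ITPilot code: chooseBackStrategy, its option names and the returned labels are hypothetical, and the default of one page for Back() is an assumption.

```javascript
// Sketch of the documented decision order for returning to a source page.
// The option names and returned labels are illustrative, not ITPilot API.
function chooseBackStrategy(options) {
    if (options.syncWithPost) {
        // Default: re-issue the POST with which the page was reached.
        return { strategy: "POST" };
    }
    if (options.backSequence) {
        // An explicit back sequence was set with setBackSequence.
        return { strategy: "BACK_SEQUENCE", sequence: options.backSequence };
    }
    // Otherwise fall back to the NSEQL Back() command, repeated as many
    // times as set with setBackPages (one page assumed if not set).
    return { strategy: "NSEQL_BACK", pages: options.backPages || 1 };
}

var byDefault = chooseBackStrategy({ syncWithPost: true });
var withSequence = chooseBackStrategy({
    syncWithPost: false,
    backSequence: "<NSEQL back program>"
});
var withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });

console.log(byDefault.strategy, withSequence.strategy, withBack.strategy);
```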

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.
  o add(record): adds a record that is to be used as part of the wrapper result.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements, except for the one that caused the error.
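
The effect of the two error actions can be illustrated with a stand-in. The sketch below is hypothetical throughout: StubRecordConstructor is not the ITPilot class, it only understands expressions of the form $<index>.<FIELD> (assumed here to be the full form of the example above), plain objects stand in for records, and it assumes that an ignored error simply leaves the field out of the result.

```javascript
var ON_ERROR_RAISE = "ON_ERROR_RAISE";   // illustrative values only
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical stand-in for Record_Constructor. It only understands
// expressions of the form "$<index>.<FIELD>".
function StubRecordConstructor(recordsObj, name) {
    this.recordsObj = recordsObj;
    this.name = name;
    this.fields = [];
}

StubRecordConstructor.prototype.add = function (fieldName, expression, errorAction) {
    this.fields.push({ fieldName: fieldName, expression: expression, errorAction: errorAction });
};

StubRecordConstructor.prototype.exec = function () {
    var output = {};
    for (var i = 0; i < this.fields.length; i++) {
        var field = this.fields[i];
        var match = /^\$(\d+)\.(\w+)$/.exec(field.expression);
        var source = match ? this.recordsObj[Number(match[1])] : undefined;
        if (source && Object.prototype.hasOwnProperty.call(source, match[2])) {
            output[field.fieldName] = source[match[2]];
        } else if (field.errorAction === ON_ERROR_RAISE) {
            throw new Error("Cannot evaluate " + field.expression + " for " + field.fieldName);
        }
        // With ON_ERROR_IGNORE the field is skipped (assumed behaviour).
    }
    return output;
};

var builder = new StubRecordConstructor([{ PARAM1: "books" }, { PRICE: 12 }], "OUTPUT");
builder.add("CATEGORY", "$0.PARAM1", ON_ERROR_RAISE);
builder.add("PRICE", "$1.PRICE", ON_ERROR_RAISE);
builder.add("STOCK", "$1.STOCK", ON_ERROR_IGNORE); // missing field, ignored
var built = builder.exec();
console.log(JSON.stringify(built));
```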

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information. In general this value will be "true", although in some cases it may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));

Figure 7 Using the Repeat function
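
One practical consequence of the do… while form is that the loop operations always run at least once, because the condition is evaluated only after each pass. The sketch below shows this with a hypothetical StubCondition whose exec() returns a Boolean; it is not the ITPilot Condition class, and countRepeatPasses and its safety cap exist only for the demonstration.

```javascript
// Hypothetical stand-in for the Condition component (not the ITPilot class).
function StubCondition(predicate) {
    this.predicate = predicate;
}
StubCondition.prototype.exec = function (inputs) {
    return this.predicate(inputs);
};

// Runs a do... while loop shaped like Figure 7 and counts the passes made.
function countRepeatPasses(condition) {
    var passes = 0;
    do {
        passes++; // the loop operations would go here
    } while (condition.exec([]) && passes < 100); // safety cap for the demo
    return passes;
}

// A condition that is never true still lets the body run once.
var neverTrue = new StubCondition(function () { return false; });
var passesWhenFalse = countRepeatPasses(neverTrue);

// A condition that is true for its first two evaluations gives three passes.
var evaluations = 0;
var trueTwice = new StubCondition(function () { return ++evaluations <= 2; });
var passesWhenTrueTwice = countRepeatPasses(trueTwice);

console.log(passesWhenFalse, passesWhenTrueTwice);
```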

5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position it occupies within the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that describes the page from which the component browsing sequence is run.
  o setRetries(count): update function for the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally indicates an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components will be reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation.
    • 1: Internet Explorer browser implementation.
    • 2: Firefox browser implementation.
    • 3: Denodo HTTP browser implementation.
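
How a retry count and a retry delay might interact can be sketched generically. The code below is not how ITPilot implements setRetries and setRetryDelay: runWithRetries is a hypothetical helper, it assumes that the count is the number of additional attempts after the first failure, and the pause is injected as a function so that the sketch stays synchronous.

```javascript
// Runs `action`, retrying up to `retries` more times when it throws.
// `pause` receives the delay in milliseconds; it is injected so that the
// sketch stays synchronous (a real wait is deliberately left out).
function runWithRetries(action, retries, delayMs, pause) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (error) {
            if (attempt >= retries) {
                throw error; // retries exhausted: propagate the last failure
            }
            attempt++;
            pause(delayMs);
        }
    }
}

var pauses = [];
var result = runWithRetries(
    function (attempt) {
        if (attempt < 2) {
            throw new Error("page not loaded");
        }
        return "page loaded on attempt " + (attempt + 1);
    },
    3,   // as would be configured with setRetries(3)
    500, // as would be configured with setRetryDelay(500)
    function (ms) { pauses.push(ms); }
);

console.log(result + " after " + pauses.length + " pauses");
```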

5.3.28 Store File

• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string-type or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): update function for the number of retries in the event of failures.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): blocks until all executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) {
      var mycustom_output = null;
      …
      return mycustom_output;
  }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
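
As a minimal sketch of the naming convention, the file below would define a hypothetical component called "slugify". The component name, its behaviour and the assumption that the input arrives as a plain string are illustrative only. The input-structure and output-structure functions are left out because their bodies are not shown in this section, and the type constant is redeclared only so that the sketch runs on its own.

```javascript
// Output-type code as listed above; ITPilot is assumed to provide it, so it
// is redeclared here only to keep the sketch runnable on its own.
var STRING_TYPE = 13;

// Main function of a hypothetical component named "slugify": it turns a
// title into a lowercase, hyphen-separated string.
function slugify_main(slugify_input) {
    var slugify_output = null;
    slugify_output = String(slugify_input)
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, "-") // collapse other characters into hyphens
        .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
    return slugify_output;
}

// Declares that the component returns a string, so no output structure
// function is needed (only RECORD_TYPE and LIST_TYPE require one).
function slugify_getOutputType() {
    return STRING_TYPE;
}

console.log(slugify_main("ITPilot 4.6 Developer Guide"));
```

Following the convention described in this chapter, such a file would be named after the component (slugify.js in this hypothetical case).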

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is straightforward: the VQL statement simply has to be written as follows:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '\' character.
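
The escaping step can be automated with a small helper before the statement is assembled. The sketch below is an assumption-laden illustration: escapeForVql and buildCreateWrapper are hypothetical names, the escape character is assumed to be the backslash, and the delimiter is passed in as a parameter because this section does not state which quote character VQL uses here. Backslashes already present in the code are left untouched.

```javascript
// Escapes every occurrence of the delimiter quote with a backslash so that
// the script can be embedded in a CREATE WRAPPER ITP statement.
// Which quote character delimits the code is an assumption of this sketch;
// pass the one the VQL statement actually uses.
function escapeForVql(jsCode, quoteChar) {
    return jsCode.split(quoteChar).join("\\" + quoteChar);
}

// Assembles the statement text around the escaped code.
function buildCreateWrapper(name, jsCode, quoteChar) {
    return "CREATE WRAPPER ITP " + name + " MAINTENANCE FALSE " +
        quoteChar + escapeForVql(jsCode, quoteChar) + quoteChar;
}

var script = 'var greeting = "hello";';
var statement = buildCreateWrapper("demo_wrapper", script, '"');
console.log(statement);
```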

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis
[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html
[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.
[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net
[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.
[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.
[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.
[JDOC] Javadoc documentation of the Developer API.
[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).
[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.
[PERL] PERL Language. http://www.perl.com
[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.
[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap
[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.
[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

Page 49: Denodo ITPilot 4.6 Developer Guidehelp.denodo.com/.../4.6/DenodoITPilot.Developer.pdf · 2017-10-25 · ITPilot 4.6 Developer Guide Deploying and Invoking ITPilot Wrapper Access Web

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 44

bull OPTIONAL (default value) The parameter is optional it does not need a value to be assigned in each wrapper query

bull OBLIGATORY The parameter is obligatory in any query made on the wrapper

bull FIXED the parameter has a constant value this value is assigned by the fixedValue parameter described below

bull fixedValue optional parameter that indicates a constant value assigned to the field

o setName(name) update function for the component name

bull name new component name

o setI18n(i18n) function which updates the process i18n

bull i18n type of internationalization to be used ITPilot provides different types of i18n configurations such as ES_EURO US_PST GB etc See [GENER] for more information about internationalization in ITPilot

o exec() main function for running the component returning a record representing the wrapper initialization parameters

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 45

5317 Iterator

bull Object Iterator

bull Description component that iterates on a list of records one by one

bull Functions

o Constructor(list)

bull list list of records on which to iterate

o hasNext() this determines whether there are more results on which to iterate ldquotruerdquo is returned if there is at least one more result

o next() this returns the next iteration element The list is a sorted sequence of records

The ldquoParallel Executionrdquo option existing in the ITPilot graphic interface becomes the next JavaScript structure using the Thread object described in section 5329

var _thread0 = new Thread() while(iteratorhasNext()) recordInstance = iteratornext() _thread0execute(_functionIterator_1 structureInstance recordInstance)

Figure 5 Using threads in the Iterator component

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 46

5.3.18 JDBCExtractor

• Object: JDBCExtractor

• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.

• Functions:

  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component's output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established at start-up, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.

  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.

  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of connections in the pool. This number of idle connections is established at start-up, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.

  o disablePool(): disables the connection pool.

  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.
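The interplay of the three pool parameters can be illustrated with a toy model. The sketch below is an assumption about how connection pools of this kind typically behave, not a description of the ITPilot pool: SketchPool, borrow and giveBack are hypothetical names, and the connect callback is expected to return an object with a run(sql) method that throws when the connection is no longer usable.

```javascript
// Illustrative model of a connection pool; all names are hypothetical.
function SketchPool(connect, maxPoolSize, initialPoolSize, checkQuery) {
    this.connect = connect;
    this.maxPoolSize = maxPoolSize;
    this.checkQuery = checkQuery;
    this.idle = [];
    this.active = 0;
    // Open the initial idle connections up front so they are ready to be used.
    for (var i = 0; i < initialPoolSize; i++) {
        this.idle.push(connect());
    }
}

SketchPool.prototype.borrow = function () {
    // Reuse a cached connection only if the check query still succeeds on it.
    while (this.idle.length > 0) {
        var candidate = this.idle.pop();
        try {
            candidate.run(this.checkQuery);
            this.active++;
            return candidate;
        } catch (e) {
            // Stale connection: drop it and try the next cached one.
        }
    }
    if (this.active >= this.maxPoolSize) {
        throw new Error("Pool exhausted");
    }
    this.active++;
    return this.connect();
};

SketchPool.prototype.giveBack = function (connection) {
    this.active--;
    this.idle.push(connection);
};
```

In this model the check query runs every time a cached connection is handed out, which is why the reference above asks for a simple query against a table that is known to exist.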


5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE... DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        ...
    }

Figure 6. Using the Loop function
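The control flow of Figure 6 can be reproduced with a stand-in condition. This is a minimal sketch under stated assumptions: SketchCondition is a hypothetical replacement for the Condition object that evaluates a plain JavaScript predicate instead of an ITPilot condition expression, and countPasses exists only for this example.

```javascript
// Hypothetical stand-in for the Condition object: exec() evaluates a plain
// JavaScript predicate instead of an ITPilot condition expression.
function SketchCondition(predicate) {
    this.predicate = predicate;
}

SketchCondition.prototype.exec = function (inputs) {
    return Boolean(this.predicate(inputs));
};

// WHILE ... DO: the condition is checked before every pass, so the body
// may run zero times.
function countPasses(limit) {
    var passes = 0;
    var loop = new SketchCondition(function () {
        return passes < limit;
    });
    while (loop.exec([])) {
        passes++; // the loop operations would go here
    }
    return passes;
}
```

Because the test precedes the body, a condition that is false from the start means the loop operations never run.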


5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: allows iterating through a set of interrelated pages by means of one or several browsing sequences.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made. The size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.

  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.

  o close(): closes the iterator.

  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.

  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.

  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence nor a POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
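The effect of setRetries and setRetryDelay can be pictured as a retry loop around the page access. The following is a sketch under stated assumptions, not the ITPilot implementation: runWithRetries is a hypothetical helper, the retry count is assumed to mean additional attempts after the first failure, and the sleep function is injected so that the example stays synchronous.

```javascript
// Hypothetical helper: runs `action`, retrying up to `retries` extra times.
// `sleep` receives the delay in milliseconds; it is injected so the sketch
// stays synchronous and easy to test.
function runWithRetries(action, retries, retryDelayMs, sleep) {
    var attempt = 0;
    while (true) {
        try {
            return action(attempt);
        } catch (error) {
            if (attempt >= retries) {
                throw error; // retries exhausted: propagate the last error
            }
            attempt++;
            sleep(retryDelayMs);
        }
    }
}
```

With three retries and a 500 ms delay, an access that fails twice and then succeeds returns its result after two waits; an access that keeps failing raises the last error once the retries are used up.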


5.3.21 Output

• Object: Output

• Description: places a record in the wrapper output.

• Functions:

  o Constructor(structure)
    • structure: parameter that indicates the component input record to be used as the wrapper result.

  o add(record): adds the component input record that is to be used as the wrapper result.
    • record: record to use.


5.3.22 Record Constructor

• Object: Record_Constructor

• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.

• Functions:

  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression. For example, “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

  o exec(): runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements except for the one that caused the error.
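How a field reference and its error action interact can be shown with a toy resolver. This is a simplified sketch, not the ITPilot expression engine: it assumes references of the form $<index>.<FIELD>, treats records as plain objects, uses local string constants in place of the real ON_ERROR_* values, and assumes that an ignored error simply leaves the field out. The names resolveReference and buildRecord are hypothetical.

```javascript
// Stand-in values for this sketch only.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Resolves a reference of the assumed form "$<index>.<FIELD>" against the
// list of input records.
function resolveReference(expression, records) {
    var match = /^\$(\d+)\.(\w+)$/.exec(expression);
    if (!match) {
        throw new Error("Unsupported expression: " + expression);
    }
    var record = records[Number(match[1])];
    if (record === undefined || !(match[2] in record)) {
        throw new Error("Cannot resolve " + expression);
    }
    return record[match[2]];
}

// Builds the output record field by field, applying each field's error action.
function buildRecord(records, fieldDefinitions) {
    var output = {};
    fieldDefinitions.forEach(function (definition) {
        try {
            output[definition.fieldName] =
                resolveReference(definition.expression, records);
        } catch (error) {
            if (definition.errorAction === ON_ERROR_RAISE) {
                throw error; // stop and report the source of the error
            }
            // ON_ERROR_IGNORE: skip this field and continue
        }
    });
    return output;
}
```

A field whose reference cannot be resolved therefore either aborts the whole construction or is silently omitted, depending on the error action chosen for that field.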


5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session's information. In general this value will be “true”, although in some cases this may not be a good option if the previous iterator is run in parallel.
    • inputPage: optional. Allows a starting page to be indicated.

  o exec(): returns a page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.


5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as browser identifier and releases that specific browser.

• Functions:

  o Constructor(page)
    • page: page loaded on the browser that is going to be released.

  o Constructor(browserUuid)
    • browserUuid: browser identifier.

  o exec(): executes the component.


5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT... UNTIL). The Repeat component is implemented in JavaScript using a do... while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));

Figure 7. Using the Repeat function
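The practical difference from the Loop component is that a do... while body always runs at least once. The sketch below illustrates this with a stand-in: StubCondition is a hypothetical replacement for the Condition object that evaluates a plain JavaScript predicate, and countRepeatPasses exists only for this example.

```javascript
// Hypothetical stand-in for the Condition object; it evaluates a plain
// JavaScript predicate instead of an ITPilot condition expression.
function StubCondition(predicate) {
    this.predicate = predicate;
}

StubCondition.prototype.exec = function (inputs) {
    return Boolean(this.predicate(inputs));
};

// do ... while: the body runs first and the condition is checked afterwards,
// so at least one pass is always made.
function countRepeatPasses(limit) {
    var passes = 0;
    var repeat = new StubCondition(function () {
        return passes < limit;
    });
    do {
        passes++; // the loop operations would go here
    } while (repeat.exec([]));
    return passes;
}
```

With a limit of zero the function still reports one pass, whereas the equivalent while loop would report none.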


5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it occupies within the process flow.


5.3.27 Sequence

• Object: Sequence

• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.

  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the component's browsing sequence is run.

  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.

  o close(): closes the connection with the running browser.

  o syncWithPost(flag): indicates whether, in order to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.
    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence nor a POST navigation has been configured as the back sequence.
    • pages: number of back pages.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
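The order in which the three back-navigation settings are consulted can be summarised as a small decision function. This is a sketch of the precedence described for syncWithPost, setBackSequence and setBackPages, not ITPilot code: chooseBackStrategy and the settings object are hypothetical, and the default of one page back is an assumption made for the example.

```javascript
// Returns a description of how to get back to the source page, following
// the documented order: POST synchronization first, then an explicit back
// sequence, and finally the NSEQL Back() command.
function chooseBackStrategy(settings) {
    if (settings.syncWithPost) {
        return { kind: "POST" };
    }
    if (settings.backSequence) {
        return { kind: "BACK_SEQUENCE", program: settings.backSequence };
    }
    // Assumption for this sketch: browse back one page unless told otherwise.
    return { kind: "NSEQL_BACK", pages: settings.backPages || 1 };
}
```

For instance, a component with syncWithPost disabled and no back sequence falls through to the Back() command, using the number of pages configured with setBackPages.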


5.3.28 Store File

• Object: StoreFile

• Description: stores the contents received as input parameter in a file.

• Functions:

  o Constructor(content, file)
    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input. In that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be automatically generated when the input file is null or is a directory.
    • generate: indicates whether the file name should be automatically generated.

  o setRetries(count): sets the number of retries in the event of failures.
    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.


5.3.29 Thread

• Object: Thread

• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): causes the thread to enter standby until all executions invoked with the execute function have finished.

  o execute(functionName, <list of arguments>): launches the execution thread on the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
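The queueing behaviour described for setMaxConcurrentThreads can be modelled without real threads. The sketch below is an illustration under stated assumptions, not the ITPilot Thread object: BoundedRunner is a hypothetical class, and each task signals completion by calling the done callback it receives.

```javascript
// Hypothetical model of a bounded executor: at most `maxConcurrent` tasks
// run at once; later requests wait in a first-in, first-out queue.
function BoundedRunner(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = 0;
    this.queue = [];
}

// `task` receives a `done` callback that it must call when it finishes.
BoundedRunner.prototype.execute = function (task) {
    this.queue.push(task);
    this.drain();
};

// Starts one task and frees its slot when the task reports completion.
BoundedRunner.prototype.start = function (task) {
    var self = this;
    var finished = false;
    this.running++;
    task(function done() {
        if (finished) {
            return; // ignore repeated completion signals
        }
        finished = true;
        self.running--;
        self.drain();
    });
};

// Starts queued tasks while there are free slots.
BoundedRunner.prototype.drain = function () {
    while (this.running < this.maxConcurrent && this.queue.length > 0) {
        this.start(this.queue.shift());
    }
};

// True when nothing is running or queued: the state wait() would block for.
BoundedRunner.prototype.isIdle = function () {
    return this.running === 0 && this.queue.length === 0;
};
```

With a limit of two, a third request stays queued until one of the first two calls done, at which point it starts immediately.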


5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { ... }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { ... }

  o This function defines the output structure that will be returned by the component. It is necessary only when the output type returned by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
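The relationship between the output type and the need for an output structure function can be written down as a small check. The following is a sketch for illustration only: the OUTPUT_TYPES table copies the numeric codes listed above into a local object, the component name “upper” is hypothetical, and the main and input structure functions are left out because their contents depend on formats not covered in this section.

```javascript
// Output type codes as listed in this section, copied into a local table
// for the purposes of this sketch.
var OUTPUT_TYPES = {
    LIST_TYPE: 1, PAGE_TYPE: 2, RECORD_TYPE: 3, SIMPLE_TYPE: 4,
    ARRAY_TYPE: 5, BINARY_TYPE: 6, BOOLEAN_TYPE: 7, DATE_TYPE: 8,
    DOUBLE_TYPE: 9, FLOAT_TYPE: 10, INT_TYPE: 11, LONG_TYPE: 12,
    STRING_TYPE: 13, URL_TYPE: 14, BROWSER_ID_TYPE: 15
};

// An output structure function is only required for record and list outputs.
function needsOutputStructure(outputType) {
    return outputType === OUTPUT_TYPES.RECORD_TYPE ||
        outputType === OUTPUT_TYPES.LIST_TYPE;
}

// Reverse lookup, useful for readable diagnostics; returns null when the
// code is not one of the listed types.
function outputTypeName(outputType) {
    for (var name in OUTPUT_TYPES) {
        if (OUTPUT_TYPES[name] === outputType) {
            return name;
        }
    }
    return null;
}

// Hypothetical component named "upper" that returns a string, so it would
// not need an upper_getOutputStructure function.
function upper_getOutputType() {
    return OUTPUT_TYPES.STRING_TYPE;
}
```

A component that returns a record or a list would return the corresponding code from its output type function and additionally provide the output structure function.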

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8. Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
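The try... finally shape of Figure 8 guarantees that the scope is closed whether or not the component fails. The sketch below demonstrates that guarantee with stand-ins: SketchScope is a hypothetical object that only counts open scopes, runInScope is a hypothetical helper, and the component is an ordinary JavaScript function rather than a CUSTOM_COMPONENT instance.

```javascript
// Hypothetical stand-in for SCOPE that only counts open scopes.
var SketchScope = {
    open: 0,
    create: function () { this.open++; },
    close: function () { this.open--; }
};

// Runs `component` inside a scope and always closes the scope afterwards.
function runInScope(component, input) {
    try {
        SketchScope.create();
        return component(input);
    } finally {
        SketchScope.close(); // runs on success and on error alike
    }
}
```

If the component throws, the error still reaches the caller, but only after the scope has been closed.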

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: only the following VQL statement has to be written:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally they must be escaped with the ‘\’ character.


REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

Page 50: Denodo ITPilot 4.6 Developer Guidehelp.denodo.com/.../4.6/DenodoITPilot.Developer.pdf · 2017-10-25 · ITPilot 4.6 Developer Guide Deploying and Invoking ITPilot Wrapper Access Web

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 45

5317 Iterator

bull Object Iterator

bull Description component that iterates on a list of records one by one

bull Functions

o Constructor(list)

bull list list of records on which to iterate

o hasNext() this determines whether there are more results on which to iterate ldquotruerdquo is returned if there is at least one more result

o next() this returns the next iteration element The list is a sorted sequence of records

The ldquoParallel Executionrdquo option existing in the ITPilot graphic interface becomes the next JavaScript structure using the Thread object described in section 5329

var _thread0 = new Thread() while(iteratorhasNext()) recordInstance = iteratornext() _thread0execute(_functionIterator_1 structureInstance recordInstance)

Figure 5 Using threads in the Iterator component

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 46

5318 JDBCExtractor

bull Object JDBCExtractor

bull Description These functions allow sending a query to any source available via JDBC and return a record list with the obtained results

bull Functions

o Constructor (uuid uri driver userName password structure baseRecords maxPoolSize initialPoolSize checkQuery query)

bull uuid component unique identifier

bull uri connection URL to the database

bull driver driver class to use to connect to the data source

bull userName user name

bull password user password

bull structure structure of the componentrsquos output record list It is defined as a record of values

bull baseRecords record list to be used

bull maxPoolSize maximum number of connections that can be manager by the browser pool at the same time

bull initialPoolSize initial number of browser pool connections A number of idle connections as established ready to be used

bull checkQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists

bull query SQL query that returns the results required by the component

o exec(query baseRecords) executes the JDBCExtractor component

bull query SQL query that returns the results required by the component

bull baseRecords record list to be used

o setPoolConfig(maxPoolSize initialPoolSize pingQuery) updates the pool configuration

bull maxPoolSize maximum number of connections that can be manager by the browser pool at the same time

bull initialPoolSize initial number of browser pool connections A number of idle connections as established ready to be used

bull pingQuery SQL query used by the pool to verify the status of the currently cached connections It is required that the query is simple and that the queried table exists

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 47

o disablePool() disables the connection pool

o addDriverProperty(propname propvalue) adds a JDBC driver property

bull propname property name

bull propvalue property value

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 48

5319 Loop

bull Description This allows loops to be made in the flow The loop will be repeated as long as the given condition is met (WHILEhellip DO) The loop component is implemented in JavaScript using a while loop with a Condition object used as the loop output condition The Condition object is defined in section 535 To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]

var loop = null loop = new Condition(ltoutput_conditiongt) looponError(RUNTIME_ERROR ON_ERROR_RAISE) while(loopexec([])) ltloop operationsgt hellip

Figure 6 Using the Loop function

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 49

5320 Next Interval Iterator

bull Object Next_Interval_Iterator

bull Description this allows for iteration by different inter-related pages by one or by different browsing sequences

bull Functions

o Constructor(sequences iterations sequenceType reuse inputPage)

bull sequences list of browsing sequences to use If there is only one sequence it will try to use it in all iterations If there is more than one sequence it will use one in each iteration

bull iterations this indicates for every sequence the number of iterations to be made the size of this list must be equal to the size of the list provided in the sequences parameter This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reuse boolean value that indicates whether the browser used to date is reused or whether a new browser is launched maintaining the sessionrsquos information

bull inputPage this indicates the page from which the next browsing sequence is to be made

o next(inputRecords inputPage) this returns the next iteration element

bull inputRecords list of input records that can be used as parameters within the browsing sequences at the next interval

bull inputPage this indicates the page from which the next pages are to be accessed

o close() this closes the iterator

o setRetries(count) this configures the number of retries in the event of error in accessing the next page

bull count number of retries

o setRetryDelay(count) this configures the interval between two retries

bull count interval in milliseconds

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 50

o syncWithPost(flag) this function indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function

bull flag ldquotruerdquo indicates that this synchronization function is to be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() method is run

o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out

bull back NSEQL back program

o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not

bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session

o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation

bull 1 Internet Explorer browser implementation

bull 2 Firefox browser implementation

bull 3 HTTP browser implementation

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 51

5321 Output

bull Object Output

bull Description this places a record in the wrapper output

bull Functions

o Constructor(structure)

bull structure parameter that indicates the component input record to be used as the wrapper result

o add(record) this allows for the component input record to be used as the wrapper result to be subsequently added

bull record record to use

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 52

5322 Record Constructor

bull Object Record_Constructor

bull Description this allows a record to be constructed using other records generated in the flow as well as generating new attributes derived from already existing ones

bull Functions

o Constructor(recordsObj name)

bull recordsObj list of input elements Each element from the list can be a record or a list of records

bull name name of the output record of the Record Constructor component

o add(fieldName expression errorAction) method for adding a new field to the record under construction

bull fieldname name of the field

bull expression field definition expression eg ldquo$0PARAM1rdquo indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]

bull errorAction action to be run in the event of it not being possible to assess the expression correctly The possible values are

bull ON_ERROR_RAISE stop wrapper run indicating the source of the error

bull ON_ERROR_IGNORE ignore the error continuing with the wrapper run

o exec() this runs the Record Constructor component instance returning an object that represents the record obtained

NOTE If the error handler or this component is set to ON_ERROR_IGNORE RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error


5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases this may not be a good option if the previous iterator is run in parallel to it.
    • inputPage: optional; allows a starting page to be indicated.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.


5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as the browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.


5.3.25 Repeat

• Description: allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
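The practical consequence of the do… while form is that the loop body always runs at least once, because the condition is evaluated only after each pass. The sketch below shows this with a hypothetical stand-in, ConditionSketch, whose exec() simply returns the result of a callback; the real Condition object evaluates an ITPilot expression instead. In this sketch the stand-in returns true while another pass is wanted, which is an assumption made to match the shape of the code in Figure 7.

```javascript
// Hypothetical stand-in for Condition: exec() returns the result of a callback.
function ConditionSketch(test) {
    this.test = test;
}
ConditionSketch.prototype.exec = function (inputs) {
    return this.test(inputs);
};

var pagesVisited = 0;
var totalPages = 3;

// Stand-in condition: true while another pass is wanted.
var repeat = new ConditionSketch(function () {
    return pagesVisited < totalPages;
});

do {
    pagesVisited++; // loop operations
} while (repeat.exec([]));

// Even with a condition that is never true, the body runs once.
var passes = 0;
var never = new ConditionSketch(function () {
    return false;
});
do {
    passes++;
} while (never.exec([]));
```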


5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from its position in the process flow.


5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter; indicates the starting page. If it is not given, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter; the page from which the component’s browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
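The fallback order described for syncWithPost, setBackSequence and setBackPages can be summarized as a small decision function. This is a sketch of the documented precedence only, not ITPilot code: chooseBackStrategy, its config object and the returned labels are hypothetical names for this example, and the default of one page back when no count is given is an assumption.

```javascript
// Returns a label describing how the source page would be restored,
// following the precedence described for the Sequence component.
function chooseBackStrategy(config) {
    if (config.syncWithPost) {
        return "POST"; // re-issue the POST with the original parameters
    }
    if (config.backSequence) {
        return "BACK_SEQUENCE"; // run the explicit NSEQL back program
    }
    // Neither option applies: run Back() the configured number of times.
    // Defaulting to 1 is an assumption of this sketch.
    return "BACK x" + (config.backPages || 1);
}

var withPost = chooseBackStrategy({ syncWithPost: true, backSequence: "<NSEQL back program>" });
var withSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var withBack = chooseBackStrategy({ syncWithPost: false, backSequence: null, backPages: 2 });
```

Here the back sequence is consulted only when POST synchronization is switched off, and the page count set with setBackPages matters only when neither of the other two options applies.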


5.3.28 Store File

• Object: StoreFile
• Description: stores the contents passed as the input parameter in a file.
• Functions:
  o Constructor(content, file)
    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.
  o exec(): runs the component.
  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.
    • generate: indicates whether the file name should be generated automatically.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.


5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to enter standby until all the executions invoked with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number.
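The usual calling pattern (one execute call per extracted record, followed by a single wait) can be tried out with a single-threaded stand-in. ThreadSketch below is hypothetical: it runs nothing in parallel, it looks functions up in a name-to-function table passed to its constructor, it defaults to a limit of one, and it drains its queue in groups of at most the configured size purely to mimic the queueing described for setMaxConcurrentThreads.

```javascript
// Hypothetical single-threaded stand-in for Thread.
function ThreadSketch(functions) {
    this.functions = functions; // name -> function lookup (assumption of this sketch)
    this.pending = [];
    this.maxConcurrent = 1;     // default chosen for this sketch only
    this.batchSizes = [];
}

ThreadSketch.prototype.setMaxConcurrentThreads = function (max) {
    this.maxConcurrent = max;
};

// Queues a call by function name; the remaining arguments are passed through.
ThreadSketch.prototype.execute = function (functionName) {
    var args = Array.prototype.slice.call(arguments, 1);
    this.pending.push({ fn: this.functions[functionName], args: args });
};

// Drains the queue in groups of at most maxConcurrent calls.
ThreadSketch.prototype.wait = function () {
    while (this.pending.length > 0) {
        var batch = this.pending.splice(0, this.maxConcurrent);
        this.batchSizes.push(batch.length);
        for (var i = 0; i < batch.length; i++) {
            batch[i].fn.apply(null, batch[i].args);
        }
    }
};

var totals = [];
function processRecord(record, factor) {
    totals.push(record.PRICE * factor);
}

var thread = new ThreadSketch({ processRecord: processRecord });
thread.setMaxConcurrentThreads(2);
var records = [{ PRICE: 1 }, { PRICE: 2 }, { PRICE: 3 }];
for (var r = 0; r < records.length; r++) {
    // The arguments after the function name must match processRecord's parameters.
    thread.execute("processRecord", records[r], 10);
}
thread.wait(); // returns once every queued execution has finished
```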


5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where “mycustom” is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
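As an illustration of the naming scheme, the functions for a hypothetical component called "shout" might look like the sketch below. Several things here are assumptions: the component name, the idea that the input arrives as a plain string, and the placeholder returned by shout_getInputStructure (how the schema is actually described is not covered in this section). STRING_TYPE is declared locally, with the value 13 from the list of output types, only so that the sketch runs on its own.

```javascript
// Output type constant; declared here only to keep the sketch self-contained.
var STRING_TYPE = 13;

// Main function of the hypothetical "shout" component.
function shout_main(shout_input) {
    var shout_output = null;
    shout_output = String(shout_input).toUpperCase();
    return shout_output;
}

// Placeholder: the format of the input schema is not shown in this sketch.
function shout_getInputStructure() {
    return null;
}

// Declares that the component returns a string.
function shout_getOutputType() {
    return STRING_TYPE;
}

// shout_getOutputStructure is omitted because the output type is
// neither RECORD_TYPE nor LIST_TYPE.
```

Following the rule above, such a file would be saved as shout.js so that the file name matches the component name.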

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> represents the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
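The try… finally structure of Figure 8 ensures that SCOPE.close() runs whether the component returns normally or throws. The sketch below demonstrates that property with hypothetical stand-ins: inside a real wrapper, SCOPE and CUSTOM_COMPONENT are supplied by ITPilot, and the counter, the "sketch-type" value, the component name and the behavior of exec() shown here are invented for the example.

```javascript
// Hypothetical stand-ins; in a wrapper these objects are provided by ITPilot.
var SCOPE = {
    open: 0, // number of scopes currently open, tracked for this sketch only
    create: function () { this.open++; },
    close: function () { this.open--; }
};

function CUSTOM_COMPONENT(type) {
    this.type = type;
    this.name = null;
}
CUSTOM_COMPONENT.prototype.setComponentName = function (name) {
    this.name = name;
};
CUSTOM_COMPONENT.prototype.exec = function (input) {
    if (input === null) {
        throw new Error(this.name + ": no input");
    }
    return this.name + "(" + input + ")";
};

// Same shape as Figure 8, wrapped in a function so it can be called twice.
function runCustom(input) {
    try {
        SCOPE.create();
        var mycustom = new CUSTOM_COMPONENT("sketch-type");
        mycustom.setComponentName("shout");
        return mycustom.exec(input);
    } finally {
        SCOPE.close(); // runs whether exec() returned or threw
    }
}

var ok = runCustom("abc");
var failed = false;
try {
    runCustom(null);
} catch (e) {
    failed = true;
}
```

After both calls the stand-in's open counter is back at zero, including for the call that raised an error.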

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the VQL statement simply has to be written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally they must be escaped with the ‘\’ character.


REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification. 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl


5.3.18 JDBCExtractor

• Object: JDBCExtractor
• Description: these functions send a query to any source available via JDBC and return a record list with the results obtained.
• Functions:
  o Constructor(uuid, uri, driver, userName, password, structure, baseRecords, maxPoolSize, initialPoolSize, checkQuery, query)
    • uuid: unique identifier of the component.
    • uri: connection URL to the database.
    • driver: driver class used to connect to the data source.
    • userName: user name.
    • password: user password.
    • structure: structure of the component’s output record list. It is defined as a record of values.
    • baseRecords: record list to be used.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • checkQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
    • query: SQL query that returns the results required by the component.
  o exec(query, baseRecords): executes the JDBCExtractor component.
    • query: SQL query that returns the results required by the component.
    • baseRecords: record list to be used.
  o setPoolConfig(maxPoolSize, initialPoolSize, pingQuery): updates the pool configuration.
    • maxPoolSize: maximum number of connections that can be managed by the pool at the same time.
    • initialPoolSize: initial number of pool connections. This number of idle connections is established, ready to be used.
    • pingQuery: SQL query used by the pool to verify the status of the currently cached connections. The query must be simple and the queried table must exist.
  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.


5.3.19 Loop

• Description: allows loops to be made in the flow. The loop is repeated as long as the given condition is met (WHILE… DO). The Loop component is implemented in JavaScript using a while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var loop = null;
    loop = new Condition(<output_condition>);
    loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    while (loop.exec([])) {
        <loop operations>
        …
    }

Figure 6: Using the Loop function
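Because the condition is checked before each pass, the body of a Loop may never run at all. The sketch below shows both cases with a hypothetical stand-in, LoopConditionSketch, whose exec() returns the result of a callback; the real Condition object evaluates an ITPilot expression, and the queue of items is invented for the example.

```javascript
// Hypothetical stand-in for Condition: exec() returns the result of a callback.
function LoopConditionSketch(test) {
    this.test = test;
}
LoopConditionSketch.prototype.exec = function (inputs) {
    return this.test(inputs);
};

var queue = ["a", "b", "c"];
var processed = [];

// The loop continues as long as the condition is met.
var loop = new LoopConditionSketch(function () {
    return queue.length > 0;
});
while (loop.exec([])) {
    processed.push(queue.shift()); // loop operations
}

// A condition that is false from the start means the body is skipped entirely.
var skipped = 0;
var emptyLoop = new LoopConditionSketch(function () {
    return false;
});
while (emptyLoop.exec([])) {
    skipped++;
}
```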


5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: allows iteration over different inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, it is used in all iterations. If there is more than one sequence, one is used in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: Boolean value that indicates whether the browser used so far is reused or a new browser is launched, maintaining the session’s information.
    • inputPage: indicates the page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: indicates the page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it was reached. This is the default synchronization function.
    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that more data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
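The rule given for the sequences parameter (a single sequence is reused on every iteration, several sequences are used one per iteration) can be written as a short helper. This is a sketch of that rule only: sequenceForIteration is a hypothetical function, iterations are counted from zero, the sequence strings are placeholders rather than real NSEQL, and returning null once the list is exhausted is an assumption, since the text does not say what happens in that case.

```javascript
// Picks the browsing sequence for a given iteration (0-based).
function sequenceForIteration(sequences, iteration) {
    if (sequences.length === 0) {
        throw new Error("At least one browsing sequence is required");
    }
    if (sequences.length === 1) {
        return sequences[0]; // single sequence: reused on every iteration
    }
    if (iteration >= sequences.length) {
        return null; // assumption: no sequence left for this iteration
    }
    return sequences[iteration]; // several sequences: one per iteration
}

var single = ["<next page sequence>"];
var several = ["<sequence 1>", "<sequence 2>"];

var reused = sequenceForIteration(single, 5);
var second = sequenceForIteration(several, 1);
var exhausted = sequenceForIteration(several, 2);
```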

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 51

5321 Output

bull Object Output

bull Description this places a record in the wrapper output

bull Functions

o Constructor(structure)

bull structure parameter that indicates the component input record to be used as the wrapper result

o add(record) this allows for the component input record to be used as the wrapper result to be subsequently added

bull record record to use

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 52

5322 Record Constructor

bull Object Record_Constructor

bull Description this allows a record to be constructed using other records generated in the flow as well as generating new attributes derived from already existing ones

bull Functions

o Constructor(recordsObj name)

bull recordsObj list of input elements Each element from the list can be a record or a list of records

bull name name of the output record of the Record Constructor component

o add(fieldName expression errorAction) method for adding a new field to the record under construction

bull fieldname name of the field

bull expression field definition expression eg ldquo$0PARAM1rdquo indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]

bull errorAction action to be run in the event of it not being possible to assess the expression correctly The possible values are

bull ON_ERROR_RAISE stop wrapper run indicating the source of the error

bull ON_ERROR_IGNORE ignore the error continuing with the wrapper run

o exec() this runs the Record Constructor component instance returning an object that represents the record obtained

NOTE If the error handler or this component is set to ON_ERROR_IGNORE RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 53

5323 Record Sequence or Extractor Sequence

bull Object Record_Sequence

bull Description This creates a browsing sequence created from the results of a record It allows sequences to be created for access to other pages from pages processed by the Extractor component

bull Functions

o Constructor(sequences sequenceDepends sequenceType reuse inputPage)

bull sequences ordered and sequential list of the NSEQL browsing sequences to be used by the component

bull sequenceDepends ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list

bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reuse Boolean value that indicates whether the browser used to date is reused or whether a new browser is launched maintaining the sessionrsquos information In general this value will be ldquotruerdquo although in some cases it may not be a good option if the previous iterator is run in parallel to it

bull inputPage optional this allows for a homepage to be indicated

o exec() this returns a page object that represents the target page of the browsing sequences

o All of the methods offered by the Sequence component

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 54

5324 Release Persistent Browser

bull Object Release_Persistent_Browser

bull Description accepts a browser id or a page as browser identifier and releases that specific browser

bull Functions

o Constructor(page)

bull page page loaded on the browser that is going to be released

o Constructor(browserUuid)

bull browserUuid browser identifier

o exec() executes the component

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 55

5325 Repeat

bull Description This allows for loops to be made in the flow The loop is repeated until the given condition is met (REPEAThellip UNTIL) The Repeat component is implemented in JavaScript using a dohellip while loop with a Condition object used as the loop output condition The Condition object is defined in section 535 To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]

var repeat = null repeat = new Condition(ltoutput_conditiongt) repeatonError(RUNTIME_ERROR ON_ERROR_RAISE) do ltloop_operationsgt hellip while(repeatexec([]))

Figure 7 Using the Repeat function

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 56

5326 Script

bull Description The component allows for part of the description logic of an ITPilot wrapper to be written in JavaScript This component has no specific JavaScript function associated When this component is used from the generation graphic interface it becomes a JavaScript function that is invoked from the place held within the process flow

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 57

5327 Sequence

bull Object Sequence

bull Description This creates a browsing sequence in NSEQL language (see [NSEQL])

bull Functions

o Constructor(sequence sequenceType reusableConnection inputPage)

bull sequence NSEQL browsing program (see [NSEQL]) bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reusableConnection this indicates whether the connection will be reused (ldquotruerdquo) or not (ldquofalserdquo) See [GENER] for further information

bull inputPage optional parameter this indicates the starting page If not the NSEQL program is run directly

o exec(inputValues inputPage) this runs the Sequence component returning the last page that the browsing sequence has reached

bull inputValues list of values that can be used as input parameters within the browsing sequence

bull inputPage optional parameter this describes the page from which the component browsing sequence is run

o setRetries(count) update function for the number of retries in the event of failures

bull count number of retries

o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated

bull mseconds this indicates the waiting time between retries in milliseconds

o close() this closes the connection with the running browser

o syncWithPost(flag) this method indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function

bull flag ldquotruerdquo indicates that this synchronization function must be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() command is run

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 58

o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out

bull back NSEQL back program

o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not

bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session

o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence

bull pages number of back pages

o toString() this returns the NSEQL (see [NSEQL]) sequence

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation bull 1 Internet Explorer browser implementation bull 2 Firefox browser implementation bull 3 Denodo HTTP browser implementation

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 59

5328 Store File

bull Object StoreFile

bull Description this stores the contents entered as the input parameter in a file

bull Functions

o Constructor(content file)

bull content string- or binary-type value that indicates the contents to be stored A page value is also supported as input In that case the page content will be stored

bull file path and name of the file where the contents are to be stored

o exec() runs the component

o setGenerateFilename(generate) this function determines if the output file name should be automatically generated when the input file is null or is a directory

bull generate indicates if the file name should be automatically generated

o setRetries(count) update function for the number of retries in the event of failures

bull count number of retries

o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated

bull mseconds this indicates the waiting time between retries in milliseconds

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 60

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each record obtained in an extraction operation is carried out concurrently.
• Functions:
  o wait(): causes the thread to wait until all executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches a thread that runs the given function.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number of concurrent threads.
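
The queueing behaviour described for setMaxConcurrentThreads can be modelled without any real threads. The sketch below is not the ITPilot Thread object: BoundedRunner, finish and idle are hypothetical names, and serving queued requests in first-in, first-out order is an assumption.

```javascript
// Hypothetical model of a concurrency limit with a waiting queue.
function BoundedRunner(max) {
  this.max = max;       // maximum executions allowed in parallel
  this.running = [];    // names of executions in progress
  this.queued = [];     // names of executions waiting for a free slot
}

// Request an execution: it starts if a slot is free, otherwise it is queued.
BoundedRunner.prototype.execute = function (name) {
  if (this.running.length < this.max) {
    this.running.push(name);
  } else {
    this.queued.push(name);
  }
};

// Signal that an execution finished; the oldest queued request takes its slot.
BoundedRunner.prototype.finish = function (name) {
  var i = this.running.indexOf(name);
  if (i === -1) { return; }
  this.running.splice(i, 1);
  if (this.queued.length > 0) {
    this.running.push(this.queued.shift());
  }
};

// True once nothing is running or queued, the point at which wait() would return.
BoundedRunner.prototype.idle = function () {
  return this.running.length === 0 && this.queued.length === 0;
};

var runner = new BoundedRunner(2);
runner.execute("record1");
runner.execute("record2");
runner.execute("record3");             // queued: both slots are busy
var queuedBefore = runner.queued.slice();
runner.finish("record1");              // record3 takes the free slot
var runningAfter = runner.running.slice();
```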

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix, store it in the path <DENODO_HOME>/metadata/itp-custom-components, and define the following functions in it:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { … }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { … }
  o This function defines the output structure returned by the component. It is required only when the output type defined by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
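
As a quick aid when writing a component, the type codes listed for the output type can be kept in one table. The sketch below is illustrative only: OUTPUT_TYPES, needsOutputStructure and typeName are hypothetical names, and the bodies of the component functions themselves are left out because the guide elides them.

```javascript
// Type codes as listed for mycustom_getOutputType().
var OUTPUT_TYPES = {
  LIST_TYPE: 1, PAGE_TYPE: 2, RECORD_TYPE: 3, SIMPLE_TYPE: 4, ARRAY_TYPE: 5,
  BINARY_TYPE: 6, BOOLEAN_TYPE: 7, DATE_TYPE: 8, DOUBLE_TYPE: 9, FLOAT_TYPE: 10,
  INT_TYPE: 11, LONG_TYPE: 12, STRING_TYPE: 13, URL_TYPE: 14, BROWSER_ID_TYPE: 15
};

// Hypothetical helper: an output structure is required only for
// RECORD_TYPE and LIST_TYPE outputs.
function needsOutputStructure(typeCode) {
  return typeCode === OUTPUT_TYPES.RECORD_TYPE ||
         typeCode === OUTPUT_TYPES.LIST_TYPE;
}

// Hypothetical helper: reverse lookup from code to name, handy in log messages.
function typeName(typeCode) {
  for (var name in OUTPUT_TYPES) {
    if (OUTPUT_TYPES[name] === typeCode) { return name; }
  }
  return null;
}
```

For instance, a component whose output type is STRING_TYPE can omit the output structure function, while one returning LIST_TYPE cannot.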

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as described in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following code:

try {
  SCOPE.create();
  mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
  mycustom.setComponentName(<component_name>);
  mycustom_output = mycustom.exec(<input_parameters>);
} finally {
  SCOPE.close();
}

Figure 8: Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
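
The point of the try/finally wrapping is that the scope is closed whether or not the component fails. The sketch below shows this with stand-ins so that it runs outside ITPilot: the SCOPE and CUSTOM_COMPONENT objects defined here are simplified mock-ups, and "hypothetical_type", callComponent and the events log are illustrative names only.

```javascript
// Stand-ins for objects that the wrapper runtime normally provides.
var events = [];
var SCOPE = {
  create: function () { events.push("create"); },
  close: function () { events.push("close"); }
};
function CUSTOM_COMPONENT(type) { this.type = type; this.name = null; }
CUSTOM_COMPONENT.prototype.setComponentName = function (name) { this.name = name; };
CUSTOM_COMPONENT.prototype.exec = function (input) {
  events.push("exec " + this.name);
  if (input === null) { throw new Error("no input"); }
  return "output of " + this.name;
};

function callComponent(input) {
  try {
    SCOPE.create();
    var mycustom = new CUSTOM_COMPONENT("hypothetical_type");
    mycustom.setComponentName("mycustom");
    return mycustom.exec(input);
  } finally {
    SCOPE.close();    // runs on success and on failure alike
  }
}

var ok = callComponent("some input");
var failed = false;
try { callComponent(null); } catch (e) { failed = true; }
```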

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is straightforward: write a VQL statement of the following form:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code just developed.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the '' character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

  o disablePool(): disables the connection pool.
  o addDriverProperty(propname, propvalue): adds a JDBC driver property.
    • propname: property name.
    • propvalue: property value.

5.3.19 Loop

• Description: allows loops to be created in the flow. The loop is repeated as long as the given condition is met (WHILE … DO). The Loop component is implemented in JavaScript as a while loop, with a Condition object used as the loop output condition. The Condition object is described in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

var loop = null;
loop = new Condition(<output_condition>);
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {
  <loop operations>
  …
}

Figure 6: Using the Loop function
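
To see the control flow of Figure 6 run end to end, the sketch below replaces the ITPilot Condition object with a stand-in. The assumptions are explicit: this Condition takes a plain JavaScript predicate rather than an ITPilot condition expression, onError does nothing, and RUNTIME_ERROR and ON_ERROR_RAISE are placeholder values.

```javascript
// Stand-in for the ITPilot Condition object (assumption: predicate-based).
function Condition(predicate) { this.predicate = predicate; }
Condition.prototype.onError = function (errorType, action) { /* no-op in this sketch */ };
Condition.prototype.exec = function (inputs) { return this.predicate(inputs); };

var RUNTIME_ERROR = "RUNTIME_ERROR";     // placeholder values for the sketch
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

var page = 1;
var visited = [];

var loop = new Condition(function () { return page <= 3; });
loop.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
while (loop.exec([])) {      // the condition is checked before every pass
  visited.push(page);        // stands in for <loop operations>
  page++;
}
```

Because the condition is evaluated first, the body does not run at all when the condition is false on entry.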

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator
• Description: allows iteration over several inter-related pages, using one or several browsing sequences.
• Functions:
  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)
    • sequences: list of browsing sequences to use. If there is only one sequence, the component tries to use it in all iterations. If there is more than one sequence, it uses one in each iteration.
    • iterations: indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is specified in the sequences parameter.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information.
    • inputPage: page from which the next browsing sequence is to be run.
  o next(inputRecords, inputPage): returns the next iteration element.
    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.
    • inputPage: page from which the next pages are to be accessed.
  o close(): closes the iterator.
  o setRetries(count): configures the number of retries in the event of an error when accessing the next page.
    • count: number of retries.
  o setRetryDelay(count): configures the interval between two retries.
    • count: interval in milliseconds.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function is to be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation
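
The rule for choosing a browsing sequence per iteration can be written down in a few lines. This is only a reading of the description of the sequences parameter, not ITPilot code: sequenceForIteration is a hypothetical helper, the sequence values are placeholder strings rather than real NSEQL, and returning null once the list is exhausted is an assumption.

```javascript
// Hypothetical helper: which browsing sequence applies to a 0-based iteration?
function sequenceForIteration(sequences, iteration) {
  if (sequences.length === 1) {
    return sequences[0];              // single sequence: reused in all iterations
  }
  if (iteration < sequences.length) {
    return sequences[iteration];      // several sequences: one per iteration
  }
  return null;                        // assumption: no sequence left to run
}

// Usage with placeholder sequence names.
var single = ["seqNext"];
var several = ["seqPage2", "seqPage3"];

var picks = [
  sequenceForIteration(single, 0),
  sequenceForIteration(single, 5),
  sequenceForIteration(several, 1),
  sequenceForIteration(several, 2)
];
```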

5.3.21 Output

• Object: Output
• Description: places a record in the wrapper output.
• Functions:
  o Constructor(structure)
    • structure: component input record to be used as the wrapper result.
  o add(record): adds a component input record to the wrapper result at a later point.
    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor
• Description: allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.
• Functions:
  o Constructor(recordsObj, name)
    • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
    • name: name of the output record of the Record Constructor component.
  o add(fieldName, expression, errorAction): adds a new field to the record under construction.
    • fieldName: name of the field.
    • expression: field definition expression; e.g. "$0.PARAM1" indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
    • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.
  o exec(): runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor returns the list of filtered elements, except for the one that caused the error.
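
The way a field expression refers to an input record, and the difference between the two error actions, can be illustrated with a toy resolver. Everything here is a sketch under assumptions: references are taken to have the form $<index>.<FIELD>, resolveReference and buildRecord are hypothetical names, the constants are placeholder values, and an ignored error is modelled simply by omitting the field, which may differ from what ITPilot does.

```javascript
var ON_ERROR_RAISE = "ON_ERROR_RAISE";     // placeholder values for the sketch
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical resolver for references of the assumed form "$<index>.<FIELD>".
function resolveReference(expression, records) {
  var match = /^\$(\d+)\.(\w+)$/.exec(expression);
  if (!match) { throw new Error("unsupported expression: " + expression); }
  var record = records[Number(match[1])];
  if (!record || !(match[2] in record)) {
    throw new Error("cannot resolve " + expression);
  }
  return record[match[2]];
}

// Builds an output record; each field definition carries its own error action.
function buildRecord(records, fields) {
  var out = {};
  fields.forEach(function (f) {
    try {
      out[f.name] = resolveReference(f.expression, records);
    } catch (e) {
      if (f.errorAction === ON_ERROR_RAISE) { throw e; }
      // ON_ERROR_IGNORE: skip this field and carry on (assumption)
    }
  });
  return out;
}

var inputs = [{ PARAM1: "alpha" }, { PRICE: 10 }];
var built = buildRecord(inputs, [
  { name: "TITLE", expression: "$0.PARAM1", errorAction: ON_ERROR_RAISE },
  { name: "COST", expression: "$1.PRICE", errorAction: ON_ERROR_RAISE },
  { name: "MISSING", expression: "$1.NOPE", errorAction: ON_ERROR_IGNORE }
]);
```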

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence
• Description: creates a browsing sequence from the results of a record. It allows sequences to be built that access other pages from pages processed by the Extractor component.
• Functions:
  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
    • sequences: ordered, sequential list of the NSEQL browsing sequences to be used by the component.
    • sequenceDepends: ordered, sequential list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reuse: boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session's information. In general this value will be "true", although it may not be a good option if the previous iterator runs in parallel with this component.
    • inputPage: optional; allows a starting page to be specified.
  o exec(): returns a page object that represents the target page of the browsing sequences.
  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser
• Description: accepts a browser id or a page as browser identifier and releases that specific browser.
• Functions:
  o Constructor(page)
    • page: page loaded in the browser that is going to be released.
  o Constructor(browserUuid)
    • browserUuid: browser identifier.
  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT … UNTIL). The Repeat component is implemented in JavaScript as a do … while loop, with a Condition object used as the loop output condition. The Condition object is described in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
  <loop_operations>
  …
} while (repeat.exec([]));

Figure 7: Using the Repeat function
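
What distinguishes the do … while form of Figure 7 from a plain while loop is that the body always runs at least once. The sketch below shows this with a stand-in for the Condition object; as before, the predicate-based Condition, the no-op onError and the constant values are assumptions made so that the code runs outside ITPilot.

```javascript
// Stand-in for the ITPilot Condition object (assumption: predicate-based).
function Condition(predicate) { this.predicate = predicate; }
Condition.prototype.onError = function (errorType, action) { /* no-op in this sketch */ };
Condition.prototype.exec = function (inputs) { return this.predicate(inputs); };

var RUNTIME_ERROR = "RUNTIME_ERROR";     // placeholder values for the sketch
var ON_ERROR_RAISE = "ON_ERROR_RAISE";

// Case 1: the body keeps running while exec() returns true.
var attempts = 0;
var repeat = new Condition(function () { return attempts < 3; });
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
  attempts++;                 // stands in for <loop_operations>
} while (repeat.exec([]));

// Case 2: the condition is false from the start, yet the body runs once.
var ranOnce = 0;
var never = new Condition(function () { return false; });
do {
  ranOnce++;
} while (never.exec([]));
```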

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When it is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component occupies in the process flow.

5.3.27 Sequence

• Object: Sequence
• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).
• Functions:
  o Constructor(sequence, sequenceType, reusableConnection, inputPage)
    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused ("true") or not ("false"). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not supplied, the NSEQL program is run directly.
  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.
    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the component's browsing sequence is run.
  o setRetries(count): sets the number of retries in the event of failure.
    • count: number of retries.
  o setRetryDelay(mseconds): sets the waiting time between retries.
    • mseconds: waiting time between retries, in milliseconds.
  o close(): closes the connection with the running browser.
  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which it arrived. This is the default synchronization function.
    • flag: "true" indicates that this synchronization function must be used. If it is "false", ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.
  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.
    • back: NSEQL back program.
  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.
    • reusingConnection: if "true", the connection from previous components is reused. If "false", a new browser is opened and the data is imported from the previous session.
  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
    • pages: number of back pages.
  o toString(): returns the NSEQL sequence (see [NSEQL]).
  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
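
The order in which the three back-navigation options are considered can be summarised as a small decision function. This is a reading of the descriptions of syncWithPost, setBackSequence and setBackPages, not ITPilot code: backNavigation, the settings object and the returned shape are hypothetical, and the default of one page when no value was set is an assumption.

```javascript
// Hypothetical decision helper for returning to the source page.
function backNavigation(settings) {
  if (settings.syncWithPost) {
    // Synchronization by POST: re-issue the POST to the page URL.
    return { kind: "POST" };
  }
  if (settings.backSequence) {
    // An explicit back sequence was configured.
    return { kind: "BACK_SEQUENCE", sequence: settings.backSequence };
  }
  // Otherwise fall back to the NSEQL Back() command.
  var pages = settings.backPages === undefined ? 1 : settings.backPages;  // default is an assumption
  return { kind: "NSEQL_BACK", pages: pages };
}

// Usage with placeholder values.
var byPost = backNavigation({ syncWithPost: true, backSequence: "<back program>" });
var bySequence = backNavigation({ syncWithPost: false, backSequence: "<back program>" });
var byBack = backNavigation({ syncWithPost: false, backPages: 2 });
```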

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 48

5319 Loop

bull Description This allows loops to be made in the flow The loop will be repeated as long as the given condition is met (WHILEhellip DO) The loop component is implemented in JavaScript using a while loop with a Condition object used as the loop output condition The Condition object is defined in section 535 To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]

var loop = null loop = new Condition(ltoutput_conditiongt) looponError(RUNTIME_ERROR ON_ERROR_RAISE) while(loopexec([])) ltloop operationsgt hellip

Figure 6 Using the Loop function

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 49

5.3.20 Next Interval Iterator

• Object: Next_Interval_Iterator

• Description: this allows iteration over different inter-related pages, using one or several browsing sequences.

• Functions:

  o Constructor(sequences, iterations, sequenceType, reuse, inputPage)

    • sequences: list of browsing sequences to use. If there is only one sequence, it will try to use it in all iterations. If there is more than one sequence, it will use one in each iteration.

    • iterations: this indicates, for every sequence, the number of iterations to be made; the size of this list must be equal to the size of the list provided in the sequences parameter. This parameter is only valid when a single browsing sequence is indicated for use in the sequences parameter.

    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL

    • reuse: boolean value that indicates whether the browser used to date is reused, or whether a new browser is launched maintaining the session’s information.

    • inputPage: this indicates the page from which the next browsing sequence is to be made.

  o next(inputRecords, inputPage): this returns the next iteration element.

    • inputRecords: list of input records that can be used as parameters within the browsing sequences at the next interval.

    • inputPage: this indicates the page from which the next pages are to be accessed.

  o close(): this closes the iterator.

  o setRetries(count): this configures the number of retries in the event of an error when accessing the next page.

    • count: number of retries.

  o setRetryDelay(count): this configures the interval between two retries.

    • count: interval in milliseconds.

  o syncWithPost(flag): this function indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.

  o setBackSequence(back): this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that more data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as the back sequence.

  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: HTTP browser implementation

5.3.21 Output

• Object: Output

• Description: this places a record in the wrapper output.

• Functions:

  o Constructor(structure)

    • structure: parameter that indicates the component input record to be used as the wrapper result.

  o add(record): this adds the component input record that is to be used as the wrapper result.

    • record: record to use.

5.3.22 Record Constructor

• Object: Record_Constructor

• Description: this allows a record to be constructed using other records generated in the flow, as well as generating new attributes derived from already existing ones.

• Functions:

  o Constructor(recordsObj, name)

    • recordsObj: list of input elements. Each element from the list can be a record or a list of records.

    • name: name of the output record of the Record Constructor component.

  o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.

    • fieldName: name of the field.

    • expression: field definition expression; e.g. “$0.PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor. To define the expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    • errorAction: action to be run if the expression cannot be evaluated correctly. The possible values are:
      • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
      • ON_ERROR_IGNORE: ignore the error, continuing with the wrapper run.

  o exec(): this runs the Record Constructor component instance, returning an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, Record Constructor will return the list of filtered elements except for the one that caused the error.
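To make the field-reference idea concrete, here is a minimal sketch in plain JavaScript, assuming a reference is written as “$<index>.<FIELD>” (for instance “$0.PARAM1” for field PARAM1 of the first input record). resolveFieldReference, the sample records and the field names are hypothetical; this is not the Record_Constructor implementation and it handles none of the expression functions from Appendix A of [GENER].

```javascript
// Resolve a reference of the form "$<index>.<FIELD>" against a list of
// records. Hypothetical helper: it covers only this simple form.
function resolveFieldReference(expression, records) {
  const match = /^\$(\d+)\.(\w+)$/.exec(expression);
  if (match === null) {
    throw new Error("Unsupported expression: " + expression);
  }
  const record = records[Number(match[1])];
  if (record === undefined || !(match[2] in record)) {
    throw new Error("Cannot resolve: " + expression);
  }
  return record[match[2]];
}

// Build one output record from two input records (plain objects here).
const inputRecords = [
  { PARAM1: "Denodo", PARAM2: "ITPilot" },
  { PRICE: 42 }
];
const fieldDefinitions = [
  ["NAME", "$0.PARAM1"],
  ["AMOUNT", "$1.PRICE"]
];

const outputRecord = {};
for (const [fieldName, expression] of fieldDefinitions) {
  outputRecord[fieldName] = resolveFieldReference(expression, inputRecords);
}

console.log(outputRecord);
```

Throwing on an unresolved reference corresponds loosely to the ON_ERROR_RAISE behaviour; an ON_ERROR_IGNORE variant would catch the error and skip the field instead.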

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: This creates a browsing sequence from the results of a record. It allows sequences to be created for access to other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

    • sequences: ordered and sequential list of the NSEQL browsing sequences to be used by the component.

    • sequenceDepends: ordered and sequential list of the DEXTL tags associated with each NSEQL browsing sequence from the sequences list.

    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used to date is reused, or whether a new browser is launched maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel to it.

    • inputPage: optional; this allows a starting page to be indicated.

  o exec(): this returns a page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as browser identifier, and releases that specific browser.

• Functions:

  o Constructor(page)

    • page: page loaded on the browser that is going to be released.

  o Constructor(browserUuid)

    • browserUuid: browser identifier.

  o exec(): executes the component.

5.3.25 Repeat

• Description: This allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

var repeat = null;
repeat = new Condition(<output_condition>);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
    <loop_operations>
    …
} while (repeat.exec([]));

Figure 7 Using the Repeat function
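The practical difference between a while loop and the do… while shape used by Repeat is that the latter always runs its body once before the condition is first evaluated. The sketch below shows this in plain JavaScript with a stand-in condition object; alwaysFalse is a hypothetical name and is not an ITPilot Condition.

```javascript
// Stand-in condition object (hypothetical): exec() always returns false.
const alwaysFalse = { exec: () => false };

let whileRuns = 0;
while (alwaysFalse.exec([])) {
  whileRuns += 1;   // never reached: the test comes before the body
}

let repeatRuns = 0;
do {
  repeatRuns += 1;  // reached exactly once: the test comes after the body
} while (alwaysFalse.exec([]));

console.log(whileRuns, repeatRuns);
```

This makes do… while the natural choice when the loop operations must run at least once, for example when the condition depends on data that the first pass produces.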

5.3.26 Script

• Description: This component allows part of the description logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When this component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the position the component holds within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: This creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)

    • sequence: NSEQL browsing program (see [NSEQL]).

    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL

    • reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • inputPage: optional parameter; this indicates the starting page. If it is not given, the NSEQL program is run directly.

  o exec(inputValues, inputPage): this runs the Sequence component, returning the last page that the browsing sequence has reached.

    • inputValues: list of values that can be used as input parameters within the browsing sequence.

    • inputPage: optional parameter; this describes the page from which the component browsing sequence is run.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o close(): this closes the connection with the running browser.

  o syncWithPost(flag): this method indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL, containing the POST parameters with which it arrived. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether there is a back sequence defined with the setBackSequence method. If there is not, an NSEQL Back() command is run.

  o setBackSequence(back): this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that more data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components will be reused. With the parameter set to “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as the back sequence.

    • pages: number of back pages.

  o toString(): this returns the NSEQL (see [NSEQL]) sequence.

  o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
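The order in which the return-to-source options are described for syncWithPost, setBackSequence and setBackPages can be summarised as a small decision function. This is only a reading of the text above, written as a sketch: chooseReturnStrategy, the settings object and the returned labels are hypothetical, and the default of one back page is an assumption that the guide does not state.

```javascript
// Decide how to return to the source page, following the order described
// for syncWithPost, setBackSequence and setBackPages. Hypothetical helper:
// ITPilot does not expose a function like this.
function chooseReturnStrategy(settings) {
  const {
    syncWithPost = true,  // described as the default synchronization function
    backSequence = null,  // NSEQL back program, if one was set
    backPages = 1         // assumption: one page back unless configured
  } = settings;

  if (syncWithPost) {
    return { kind: "POST" };
  }
  if (backSequence !== null) {
    return { kind: "BACK_SEQUENCE", program: backSequence };
  }
  return { kind: "NSEQL_BACK", pages: backPages };
}

const byDefault = chooseReturnStrategy({});
const withBackSequence = chooseReturnStrategy({
  syncWithPost: false,
  backSequence: "<NSEQL back program>"  // placeholder, not real NSEQL
});
const withBackPages = chooseReturnStrategy({ syncWithPost: false, backPages: 2 });

console.log(byDefault.kind, withBackSequence.kind, withBackPages.kind);
```

The back-page count only matters in the last branch, which matches the description of setBackPages as applying when neither of the other two mechanisms is configured.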

5.3.28 Store File

• Object: StoreFile

• Description: this stores the contents entered as the input parameter in a file.

• Functions:

  o Constructor(content, file)

    • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content will be stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): this function determines whether the output file name should be automatically generated when the input file is null or is a directory.

    • generate: indicates whether the file name should be automatically generated.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread

• Description: this represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): this causes the thread to enter standby until all executions invoked with the execute function have finished.

  o execute(functionName, <list of arguments>): this launches a thread that runs the described function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: list of arguments separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests will be queued until the ongoing executions finish.

    • int: maximum number.
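The queuing behaviour described for setMaxConcurrentThreads can be pictured with a small bookkeeping model. The sketch below is not ITPilot's Thread object and starts no real threads; BoundedExecutions, its methods and the task names are hypothetical, and the first-in, first-out order of the queue is an assumption.

```javascript
// Hypothetical model of a limit on concurrent executions: at most
// maxConcurrent tasks are "running"; later requests wait in a FIFO queue.
class BoundedExecutions {
  constructor(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.running = [];
    this.queued = [];
  }
  // Request an execution: start it if a slot is free, otherwise queue it.
  execute(taskName) {
    if (this.running.length < this.maxConcurrent) {
      this.running.push(taskName);
    } else {
      this.queued.push(taskName);
    }
  }
  // Mark a running task as finished and promote the oldest queued request.
  finish(taskName) {
    const index = this.running.indexOf(taskName);
    if (index === -1) {
      throw new Error("Not running: " + taskName);
    }
    this.running.splice(index, 1);
    if (this.queued.length > 0) {
      this.running.push(this.queued.shift());
    }
  }
  // True when nothing is running or queued, i.e. a wait() could return.
  isIdle() {
    return this.running.length === 0 && this.queued.length === 0;
  }
}

const executions = new BoundedExecutions(2);
executions.execute("record-1");
executions.execute("record-2");
executions.execute("record-3");   // queued: both slots are taken
const queuedBefore = executions.queued.slice();

executions.finish("record-1");    // frees a slot, so record-3 starts
const runningAfter = executions.running.slice();

console.log(queuedBefore, runningAfter);
```

In this model a wait() call would correspond to blocking until isIdle() returns true.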

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically using the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, with the following functions:

• mycustom_main(mycustom_input) {
      var mycustom_output = null;
      …
      return mycustom_output;
  }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This is the function that defines the component output type. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function is responsible for defining the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
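A minimal sketch of what such a component file might contain is shown below, for a hypothetical component named “trimtext” that strips surrounding whitespace from a string. It assumes that the input arrives as a plain string and that STRING_TYPE is predefined inside ITPilot; the constant is redefined here, with the value listed above, only so that the sketch runs on its own. The input-structure function is left out because the guide does not show the format of the schema it returns.

```javascript
// Output type constant, using the value listed above. Redefined here only so
// that the sketch is self-contained; inside ITPilot it is assumed to be
// predefined.
const STRING_TYPE = 13;

// Main function of a hypothetical component named "trimtext".
// Assumption: the input arrives as a plain string.
function trimtext_main(trimtext_input) {
  var trimtext_output = null;
  trimtext_output = String(trimtext_input).trim();
  return trimtext_output;
}

// Declares the output type. The type is neither RECORD_TYPE nor LIST_TYPE,
// so no trimtext_getOutputStructure function is needed.
function trimtext_getOutputType() {
  return STRING_TYPE;
}

console.log(trimtext_main("  ITPilot  "), trimtext_getOutputType());
```

Following the naming rule above, these functions would be saved in a file named after the component, with the .js suffix.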

5.4.2 Using Custom Components

If a custom component developed in JavaScript is to be used, it should be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented as a .js file, the name of which matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

try {
    SCOPE.create();
    mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
    mycustom.setComponentName(<component_name>);
    mycustom_output = mycustom.exec(<input_parameters>);
} finally {
    SCOPE.close();
}

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> represents the name of the component.

• <input_parameters> is the list of input parameters that the custom component receives as input.
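The try… finally shape of Figure 8 ensures that the scope is closed even when the component fails. The sketch below demonstrates that behaviour outside ITPilot; STAND_IN_SCOPE, failingComponent and scopeLog are hypothetical stand-ins and do not reproduce the real SCOPE or CUSTOM_COMPONENT objects.

```javascript
// Stand-ins (hypothetical) for the SCOPE object and a custom component, so
// that the try/finally shape of Figure 8 can be run outside ITPilot.
const scopeLog = [];
const STAND_IN_SCOPE = {
  create: () => scopeLog.push("create"),
  close: () => scopeLog.push("close")
};
const failingComponent = {
  exec: () => { throw new Error("component failed"); }
};

let caughtMessage = null;
try {
  try {
    STAND_IN_SCOPE.create();
    failingComponent.exec();
  } finally {
    STAND_IN_SCOPE.close();  // runs even though exec() threw
  }
} catch (error) {
  caughtMessage = error.message;
}

console.log(scopeLog, caughtMessage);
```

The outer try… catch exists only so that the example can report the error; the error still propagates past the finally block, as it would in a wrapper.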

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the VQL statement simply has to be written as follows:

CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the recently generated JavaScript code.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘\’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl


Developing ITPilot Wrappers with JavaScript 61

54 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

541 Developing Custom Components

Custom components can be graphically developed by using the wrapper generation tool (see [GENER]) but they can also be developed in JavaScript To achieve it a file with js suffix must be created and stored in the path ltDENODO_HOMEgtmetadataitp-custom-components with the following functions

bull mycustom_main(mycustom_input) var mycustom_output = null hellip return mycustom_output

o This is the main function where ldquo mycustomrdquo is the name of the custom component

bull mycustom_getInputStructure() hellip

o This function allows to define the input schema

bull mycustom_getOutputType() return ltTYPEgt

o This is the function that defines the component output type The possible values are

bull LIST_TYPE = 1

bull PAGE_TYPE = 2

bull RECORD_TYPE = 3

bull SIMPLE_TYPE = 4

bull ARRAY_TYPE = 5

bull BINARY_TYPE = 6

bull BOOLEAN_TYPE = 7

bull DATE_TYPE = 8

bull DOUBLE_TYPE = 9

bull FLOAT_TYPE = 10

bull INT_TYPE = 11

bull LONG_TYPE = 12

bull STRING_TYPE = 13

bull URL_TYPE = 14

bull BROWSER_ID_TYPE = 15

bull mycustom_getOutputStructure) hellip

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 62

o This function is responsible for defining the output structure that will be returned by the component It is necessary only when the output type defined by the function myCustom_getOutputType is of type RECORD_TYPE or LIST_TYPE

542 Using Custom Components

If a custom component developed in JavaScript is to be used then it should be stored in JavaScript format (with js extension) in the ltDENODO_HOMEgtmetadataitp-custom-components directory Each component is represented as a js file the name of which matches the name of the custom component The main function of the custom component is ltcomponentgt_main(Inputelement) where ltcomponentgt is the name of the custom component as mentioned in the previous section To use a custom component from a wrapper developed in JavaScript the following piece of code should be used

try SCOPEcreate() mycustom = new CUSTOM_COMPONENT(ltcustomcomponent_typegt) mycustomsetComponentName(ltcomponent_namegt) mycustom_output = mycustomexec(ltinput_parametersgt) finally SCOPEclose()

Figure 8 Using custom components from JavaScript

where bull ltcustomcomponent_typegt is the type of the custom component to be used bull ltcomponent_namegt represents the name of the component bull ltinput_parametersgt is the list of input parameters the custom component receives as input

55 WRAPPER DEVELOPMENT

Once the script has been developed creating a wrapper is very simple as the VQL statement has simply to be written as follows

CREATE WRAPPER ITP ltnamegt [MAINTENANCE FALSE] jscode

where jscode is the recently generated JavaScript code

NOTE The VQL syntax uses quotes to delimit the JavaScript code so if quotes are to be used internally they must be escaped with the lsquorsquo character

ITPilot 46 Developer Guide

References 63

REFERENCES

[AXIS] Apache Axis Web Server httpwsapacheorgaxis

[DATEFORMAT] Java Format Representation for dates httpjavasuncomj2se150docsapijavatextSimpleDateFormathtml

[DEXTL] Denodo DEXTL 46 Manual Denodo Technologies 2011

[DOTNET] Microsoft NET Framework httpwwwmicrosoftcomnet

[DPORT] Denodo Virtual DataPort 46 Administration Guide Denodo Technologies 2011

[ECMA262] Standard ECMA-262 ECMAScript Language Specification 30

[GENER] Denodo ITPilot 46 Generation Environment Guide Denodo Technologies 2011

[JDOC] Javadoc documentation of the Developer API

[MIME] RFC 2045 Multipurpose Internet Mail Extensions (MIME)

[NSEQL] Denodo ITPilot 46 NSEQL Manual (Navigation SEQuence Language) Denodo Technologies 2011

[PERL] PERL Language httpwwwperlcom

[USER] Denodo ITPilot 46 User Guide Denodo Technologies 2011

[SOAP] SOAP Version 12 W3C Recommendation httpwwww3orgTRsoap

[VQL] Denodo Virtual DataPort 46 Advanced VQL Guide Denodo Technologies 2011

[WSDL] Web Services Description Language (WSDL) 11 W3C Note httpwwww3orgTRwsdl

  • DENODO ITPILOT 46 DEVELOPER GUIDE
  • INDEX
  • FIGURES
  • PREFACE
  • 1 INTRODUCTION
  • 2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
    • 21 WEB SERVICE TYPES
    • 22 INVOKING SOAP WEB SERVICES
    • 23 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
      • 231 HTML Output Configuration
        • 24 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
          • 3 ITPILOT DEVELOPMENT API
            • 31 CONNECTING TO THE SERVER
            • 32 OBTAINING WRAPPERS
            • 33 USING WRAPPERS
            • 34 PROCESSING QUERY RESULTS
              • 341 Canceling Queries
                • 35 EXAMPLE OF USE
                  • 4 CREATING CUSTOM ITPILOT FUNCTIONS
                    • 41 NAMING CONVENTIONS AND ANNOTATIONS
                    • 42 COMPOUND TYPES
                    • 43 PAGE TYPE
                    • 44 CUSTOM FUNCTION RETURN TYPE
                    • 45 EXAMPLE
                      • 5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
                        • 51 INTRODUCTION
                        • 52 REPRESENTATION FORMAT OF A WRAPPER
                          • 521 Initialization of Searchable Parameters
                          • 522 Main Function
                          • 523 Generating the Output Structure
                            • 53 PREDEFINED ITPILOT COMPONENT GUIDE
                              • 531 Introduction
                              • 532 Data Structures
                                • 5321 Record Structure
                                • 5322 Record List
                                  • 533 Common functions
                                    • 5331 onError function
                                    • 5332 debugLevel function
                                      • 534 Add Record To List
                                      • 535 Condition
                                      • 536 Create List
                                      • 537 Create Persistent Browser
                                      • 538 Diff
                                      • 539 ExecuteJS
                                      • 5310 Expression
                                      • 5311 Extractor
                                      • 5312 Fetch
                                      • 5313 Filter
                                      • 5314 Form Iterator
                                      • 5315 Get Page
                                      • 5316 Init
                                      • 5317 Iterator
                                      • 5318 JDBCExtractor
                                      • 5319 Loop
                                      • 5320 Next Interval Iterator
                                      • 5321 Output
                                      • 5322 Record Constructor
                                      • 5323 Record Sequence or Extractor Sequence
                                      • 5324 Release Persistent Browser
                                      • 5325 Repeat
                                      • 5326 Script
                                      • 5327 Sequence
                                      • 5328 Store File
                                      • 5329 Thread
                                        • 54 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
                                          • 541 Developing Custom Components
                                          • 542 Using Custom Components
                                            • 55 WRAPPER DEVELOPMENT
                                              • REFERENCES
Page 55: Denodo ITPilot 4.6 Developer Guidehelp.denodo.com/.../4.6/DenodoITPilot.Developer.pdf · 2017-10-25 · ITPilot 4.6 Developer Guide Deploying and Invoking ITPilot Wrapper Access Web

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 50

    o syncWithPost(flag): this function indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
        • flag: “true” indicates that this synchronization function is to be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

    o setBackSequence(back): this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that further data extraction operations can be carried out.
        • back: NSEQL back program.

    o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.
        • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

    o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

    o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
        • 0: default browser implementation
        • 1: Internet Explorer browser implementation
        • 2: Firefox browser implementation
        • 3: HTTP browser implementation
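The way syncWithPost, setBackSequence and setBackPages interact can be summarized as a small decision function. The following is only a sketch of the rules described above, not ITPilot code: chooseBackStrategy and its option names are hypothetical, and falling back to one page when backPages is not configured is an assumption.

```javascript
// Hypothetical helper: models how the return to the source page is chosen.
// It mirrors the rules in the text and does not call any ITPilot API.
function chooseBackStrategy(options) {
  // POST synchronization is described as the default, so it stays enabled
  // unless it is explicitly switched off.
  var syncWithPost = options.syncWithPost !== false;
  if (syncWithPost) {
    return { strategy: "POST" };
  }
  // With POST synchronization disabled, an explicit back sequence wins.
  if (options.backSequence) {
    return { strategy: "BACK_SEQUENCE", sequence: options.backSequence };
  }
  // Otherwise the NSEQL Back() command is used.
  var pages = options.backPages || 1; // assumption: one page if not configured
  return { strategy: "NSEQL_BACK", pages: pages };
}

var byDefault = chooseBackStrategy({});
var withSequence = chooseBackStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var withBack = chooseBackStrategy({ syncWithPost: false, backPages: 2 });
```

Here byDefault selects the POST strategy, withSequence selects the explicit back sequence, and withBack falls through to Back() with two pages.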


5.3.21 Output

• Object: Output

• Description: this places a record in the wrapper output.

• Functions:

    o Constructor(structure)
        • structure: parameter that indicates the component input record to be used as the wrapper result.

    o add(record): this allows the component input record that is to be used as the wrapper result to be added subsequently.
        • record: record to use.


5.3.22 Record Constructor

• Object: Record_Constructor

• Description: this allows a record to be constructed from other records generated in the flow, and new attributes to be derived from existing ones.

• Functions:

    o Constructor(recordsObj, name)
        • recordsObj: list of input elements. Each element of the list can be a record or a list of records.
        • name: name of the output record of the Record Constructor component.

    o add(fieldName, expression, errorAction): method for adding a new field to the record under construction.
        • fieldName: name of the field.
        • expression: field definition expression; e.g. “$0PARAM1” indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list passed to the constructor. To define the expression, ITPilot provides a set of functions described in Appendix A of [GENER].
        • errorAction: action to be taken if the expression cannot be evaluated correctly. The possible values are:
            • ON_ERROR_RAISE: stop the wrapper run, indicating the source of the error.
            • ON_ERROR_IGNORE: ignore the error and continue with the wrapper run.

    o exec(): this runs the Record Constructor component instance and returns an object that represents the record obtained.

NOTE: If the error handler of this component is set to ON_ERROR_IGNORE, RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error.
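The difference between the two errorAction values can be illustrated outside ITPilot. The sketch below is an assumption-laden stand-in, not the real component: StandInRecordConstructor is a hypothetical name, expressions are plain JavaScript functions instead of ITPilot expressions, the constant values are placeholders, and skipping the failing field under ON_ERROR_IGNORE is an assumed behavior.

```javascript
// Placeholder constants; the real values used by ITPilot are not shown in this guide.
var ON_ERROR_RAISE = "ON_ERROR_RAISE";
var ON_ERROR_IGNORE = "ON_ERROR_IGNORE";

// Hypothetical stand-in for Record_Constructor. Each expression is a plain
// JavaScript function that receives the list of input records.
function StandInRecordConstructor(recordsObj, name) {
  this.records = recordsObj;
  this.name = name;
  this.fields = [];
}

StandInRecordConstructor.prototype.add = function (fieldName, expression, errorAction) {
  this.fields.push({ fieldName: fieldName, expression: expression, errorAction: errorAction });
};

StandInRecordConstructor.prototype.exec = function () {
  var output = {};
  for (var i = 0; i < this.fields.length; i++) {
    var field = this.fields[i];
    try {
      output[field.fieldName] = field.expression(this.records);
    } catch (e) {
      if (field.errorAction === ON_ERROR_RAISE) {
        // Stop the run and report which field failed.
        throw new Error(this.name + "." + field.fieldName + ": " + e.message);
      }
      // ON_ERROR_IGNORE: skip this field and continue (assumed behavior).
    }
  }
  return output;
};

var constructor = new StandInRecordConstructor([{ PARAM1: "abc" }], "OUT");
// Takes PARAM1 from the first input record.
constructor.add("COPY", function (records) { return records[0].PARAM1; }, ON_ERROR_RAISE);
// Fails because there is no second input record; the error is ignored.
constructor.add("BROKEN", function (records) { return records[1].PARAM1; }, ON_ERROR_IGNORE);
var built = constructor.exec();
```

With ON_ERROR_RAISE on the second field instead, exec() would throw and no record would be returned.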


5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: this creates a browsing sequence from the results of a record. It allows sequences to be created that access other pages from pages processed by the Extractor component.

• Functions:

    o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)
        • sequences: ordered list of the NSEQL browsing sequences to be used by the component.
        • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser is launched that maintains the session’s information. In general this value will be “true”, although this may not be a good option if the previous iterator runs in parallel with it.
        • inputPage: optional; this allows a starting page to be indicated.

    o exec(): this returns a page object that represents the target page of the browsing sequences.

    o All of the methods offered by the Sequence component.


5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser id or a page as browser identifier and releases that specific browser.

• Functions:

    o Constructor(page)
        • page: page loaded on the browser that is going to be released.

    o Constructor(browserUuid)
        • browserUuid: browser identifier.

    o exec(): executes the component.


5.3.25 Repeat

• Description: this allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT … UNTIL). The Repeat component is implemented in JavaScript using a do … while loop, with a Condition object used as the loop exit condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
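The shape of Figure 7 can be tried outside ITPilot with a stand-in for the Condition object. This is a sketch only: StandInCondition is a hypothetical name, it evaluates a plain JavaScript predicate instead of an ITPilot condition expression, and it assumes that exec() returns true while another iteration is required.

```javascript
// Hypothetical stand-in for the Condition component.
function StandInCondition(predicate) {
  this.predicate = predicate;
}

StandInCondition.prototype.exec = function (inputs) {
  return Boolean(this.predicate(inputs));
};

var page = 0;
var visited = [];

// Assumption: the condition is true while the loop still has work to do.
var repeat = new StandInCondition(function () { return page < 3; });

do {
  // Stands in for <loop_operations>: the body always runs at least once,
  // because the condition is only checked after each iteration.
  page = page + 1;
  visited.push(page);
} while (repeat.exec([]));
```

After the loop, visited holds the three page numbers processed before the condition stopped the iteration.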


5.3.26 Script

• Description: this component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When this component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the place it holds within the process flow.


5.3.27 Sequence

• Object: Sequence

• Description: this creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

    o Constructor(sequence, sequenceType, reusableConnection, inputPage)
        • sequence: NSEQL browsing program (see [NSEQL]).
        • sequenceType: type of pool to use. The possible values are:
            • SEQUENCE_IEBROWSER
            • SEQUENCE_HTTP_BROWSER
            • SEQUENCE_FTP
            • SEQUENCE_LOCAL
        • reusableConnection: this indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
        • inputPage: optional parameter; this indicates the starting page. If it is not provided, the NSEQL program is run directly.

    o exec(inputValues, inputPage): this runs the Sequence component and returns the last page that the browsing sequence has reached.
        • inputValues: list of values that can be used as input parameters within the browsing sequence.
        • inputPage: optional parameter; this is the page from which the component browsing sequence is run.

    o setRetries(count): sets the number of retries in the event of failures.
        • count: number of retries.

    o setRetryDelay(mseconds): sets the waiting time between retries.
        • mseconds: waiting time between retries, in milliseconds.

    o close(): this closes the connection with the running browser.

    o syncWithPost(flag): this method indicates whether, in order to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.
        • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.


    o setBackSequence(back): this function optionally allows an explicit browsing sequence back to the source page to be indicated, so that further data extraction operations can be carried out.
        • back: NSEQL back program.

    o setReusingConnection(reusingConnection): this indicates whether the connection will be reused or not.
        • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

    o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.
        • pages: number of back pages.

    o toString(): this returns the NSEQL sequence (see [NSEQL]).

    o setBrowserType(browserType): this function determines the browser implementation to use in the component. The accepted values are:
        • 0: default browser implementation
        • 1: Internet Explorer browser implementation
        • 2: Firefox browser implementation
        • 3: Denodo HTTP browser implementation
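What setRetries and setRetryDelay configure can be pictured as a retry loop around the browsing operation. The following sketch is not ITPilot code: runWithRetries is a hypothetical helper, the delay is injected as a callback so the example needs no timers, and it assumes that the retry count refers to the extra attempts made after the first failure.

```javascript
// Hypothetical helper: runs an action and retries it when it throws.
// Assumption: "retries" counts the attempts made after the first failure.
function runWithRetries(action, retries, retryDelayMs, sleep) {
  var attempt = 0;
  while (true) {
    try {
      return action(attempt);
    } catch (e) {
      if (attempt >= retries) {
        throw e; // retries exhausted: propagate the last error
      }
      attempt = attempt + 1;
      sleep(retryDelayMs); // wait before the next attempt
    }
  }
}

var waits = [];
var calls = 0;

var result = runWithRetries(
  function () {
    calls = calls + 1;
    // Simulates a page that only loads on the third attempt.
    if (calls < 3) { throw new Error("page not loaded"); }
    return "page-" + calls;
  },
  2,    // comparable to setRetries(2)
  500,  // comparable to setRetryDelay(500)
  function (ms) { waits.push(ms); } // records the delay instead of sleeping
);
```

In this run the action fails twice, two delays of 500 milliseconds are recorded, and the third attempt returns its value. With fewer retries configured, the last error would be rethrown instead.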


5.3.28 Store File

• Object: StoreFile

• Description: this stores the contents entered as the input parameter in a file.

• Functions:

    o Constructor(content, file)
        • content: string- or binary-type value that indicates the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
        • file: path and name of the file where the contents are to be stored.

    o exec(): runs the component.

    o setGenerateFilename(generate): this function determines whether the output file name should be generated automatically when the input file is null or is a directory.
        • generate: indicates whether the file name should be generated automatically.

    o setRetries(count): sets the number of retries in the event of failures.
        • count: number of retries.

    o setRetryDelay(mseconds): sets the waiting time between retries.
        • mseconds: waiting time between retries, in milliseconds.


5.3.29 Thread

• Object: Thread

• Description: this represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

    o wait(): this causes the thread to wait until all executions invoked with the execute function have finished.

    o execute(functionName, <list of arguments>): this launches a thread that runs the given function.
        • functionName: name of the JavaScript function to be run.
        • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

    o setMaxConcurrentThreads(int): allows the maximum number of Thread instances used in parallel to be configured. Later requests are queued until the ongoing executions finish.
        • int: maximum number.
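The queuing rule behind setMaxConcurrentThreads can be modeled without real threads. The sketch below is a single-threaded stand-in, not the ITPilot Thread object: StandInThread is a hypothetical name, it takes a function reference rather than a function name, and each task receives a done() callback as its first argument so the example can decide when a task finishes. It has no wait() because nothing actually runs in parallel.

```javascript
// Hypothetical stand-in that only models when each task is allowed to start.
function StandInThread() {
  this.max = Infinity;
  this.running = 0;
  this.queue = [];
}

StandInThread.prototype.setMaxConcurrentThreads = function (max) {
  this.max = max;
};

// Queues the task, then starts as many queued tasks as the limit allows.
StandInThread.prototype.execute = function (fn) {
  var args = Array.prototype.slice.call(arguments, 1);
  this.queue.push({ fn: fn, args: args });
  this.startNext();
};

StandInThread.prototype.startNext = function () {
  var self = this;
  while (this.running < this.max && this.queue.length > 0) {
    var task = this.queue.shift();
    this.running = this.running + 1;
    var done = function () {
      // A finished task frees a slot, so a queued task may start.
      self.running = self.running - 1;
      self.startNext();
    };
    task.fn.apply(null, [done].concat(task.args));
  }
};

var started = [];
var pending = [];

function processRecord(done, id) {
  started.push(id);
  pending.push(done); // the task finishes later, when done() is called
}

var thread = new StandInThread();
thread.setMaxConcurrentThreads(2);
thread.execute(processRecord, "r1");
thread.execute(processRecord, "r2");
thread.execute(processRecord, "r3"); // queued: two tasks are already running
var startedBeforeFinish = started.slice();
pending[0](); // r1 finishes, so r3 is allowed to start
```

Before the first task finishes only r1 and r2 have started; r3 starts as soon as a slot is released.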


5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically using the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

    o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

    o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

    o This function defines the component output type. The possible values are:
        • LIST_TYPE = 1
        • PAGE_TYPE = 2
        • RECORD_TYPE = 3
        • SIMPLE_TYPE = 4
        • ARRAY_TYPE = 5
        • BINARY_TYPE = 6
        • BOOLEAN_TYPE = 7
        • DATE_TYPE = 8
        • DOUBLE_TYPE = 9
        • FLOAT_TYPE = 10
        • INT_TYPE = 11
        • LONG_TYPE = 12
        • STRING_TYPE = 13
        • URL_TYPE = 14
        • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }


    o This function defines the output structure that the component returns. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
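As an illustration of the naming convention, a component file for a component called “mycustom” might start out as follows. This is a minimal sketch under stated assumptions: the upper-casing logic is invented for the example, the input is assumed to arrive as a plain string, and STRING_TYPE is redefined locally only so that the snippet runs on its own. mycustom_getInputStructure is left out because this section does not show its body, and mycustom_getOutputStructure is not needed for a string output.

```javascript
// Value taken from the list of output types above; defined here only so
// that the sketch runs standalone.
var STRING_TYPE = 13;

// Main function of the hypothetical "mycustom" component.
// Assumption: the input is passed in as a plain string.
function mycustom_main(mycustom_input) {
  var mycustom_output = null;
  mycustom_output = String(mycustom_input).toUpperCase();
  return mycustom_output;
}

// Declares that mycustom_main returns a string.
function mycustom_getOutputType() {
  return STRING_TYPE;
}
```

Calling mycustom_main("abc") in this sketch returns "ABC", and the declared output type matches the value actually returned.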

5.4.2 Using Custom Components

To use a custom component developed in JavaScript, it must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
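The try … finally structure of Figure 8 ensures that the scope is closed even if the component fails. The sketch below shows this with stand-ins so that it can run outside ITPilot: the SCOPE and CUSTOM_COMPONENT objects defined here are hypothetical replacements for the ones supplied by the wrapper runtime, and "hypothetical_type" and runComponent are names invented for the example.

```javascript
// Stand-ins for objects that the wrapper runtime normally provides.
var log = [];
var SCOPE = {
  create: function () { log.push("create"); },
  close: function () { log.push("close"); }
};

function CUSTOM_COMPONENT(type) {
  this.type = type;
  this.name = null;
}

CUSTOM_COMPONENT.prototype.setComponentName = function (name) {
  this.name = name;
};

// The stand-in simply echoes its input and fails when none is given.
CUSTOM_COMPONENT.prototype.exec = function (input) {
  if (input === undefined) {
    throw new Error(this.name + ": missing input");
  }
  log.push("exec " + this.name);
  return input;
};

function runComponent(input) {
  try {
    SCOPE.create();
    var mycustom = new CUSTOM_COMPONENT("hypothetical_type");
    mycustom.setComponentName("mycustom");
    return mycustom.exec(input);
  } finally {
    SCOPE.close(); // always runs, even when exec() throws
  }
}

var ok = runComponent("abc");
var failed = false;
try {
  runComponent(undefined);
} catch (e) {
  failed = true;
}
```

The log shows one create/close pair per call, including the call in which exec() threw.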

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is very simple: the VQL statement just has to be written as follows:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the recently generated JavaScript code.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are to be used internally, they must be escaped with the ‘’ character.


REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

  • DENODO ITPILOT 46 DEVELOPER GUIDE
  • INDEX
  • FIGURES
  • PREFACE
  • 1 INTRODUCTION
  • 2 DEPLOYING AND INVOKING ITPILOT WRAPPER ACCESS WEB SERVICES
    • 21 WEB SERVICE TYPES
    • 22 INVOKING SOAP WEB SERVICES
    • 23 INVOKING THE EXPORTED REST AND HTML WEB SERVICES
      • 231 HTML Output Configuration
        • 24 CONFIGURING CONNECTIONS IN THE PUBLISHED WEB SERVICES
          • 3 ITPILOT DEVELOPMENT API
            • 31 CONNECTING TO THE SERVER
            • 32 OBTAINING WRAPPERS
            • 33 USING WRAPPERS
            • 34 PROCESSING QUERY RESULTS
              • 341 Canceling Queries
                • 35 EXAMPLE OF USE
                  • 4 CREATING CUSTOM ITPILOT FUNCTIONS
                    • 41 NAMING CONVENTIONS AND ANNOTATIONS
                    • 42 COMPOUND TYPES
                    • 43 PAGE TYPE
                    • 44 CUSTOM FUNCTION RETURN TYPE
                    • 45 EXAMPLE
                      • 5 DEVELOPING ITPILOT WRAPPERS WITH JAVASCRIPT
                        • 51 INTRODUCTION
                        • 52 REPRESENTATION FORMAT OF A WRAPPER
                          • 521 Initialization of Searchable Parameters
                          • 522 Main Function
                          • 523 Generating the Output Structure
                            • 53 PREDEFINED ITPILOT COMPONENT GUIDE
                              • 531 Introduction
                              • 532 Data Structures
                                • 5321 Record Structure
                                • 5322 Record List
                                  • 533 Common functions
                                    • 5331 onError function
                                    • 5332 debugLevel function
                                      • 534 Add Record To List
                                      • 535 Condition
                                      • 536 Create List
                                      • 537 Create Persistent Browser
                                      • 538 Diff
                                      • 539 ExecuteJS
                                      • 5310 Expression
                                      • 5311 Extractor
                                      • 5312 Fetch
                                      • 5313 Filter
                                      • 5314 Form Iterator
                                      • 5315 Get Page
                                      • 5316 Init
                                      • 5317 Iterator
                                      • 5318 JDBCExtractor
                                      • 5319 Loop
                                      • 5320 Next Interval Iterator
                                      • 5321 Output
                                      • 5322 Record Constructor
                                      • 5323 Record Sequence or Extractor Sequence
                                      • 5324 Release Persistent Browser
                                      • 5325 Repeat
                                      • 5326 Script
                                      • 5327 Sequence
                                      • 5328 Store File
                                      • 5329 Thread
                                        • 54 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS
                                          • 541 Developing Custom Components
                                          • 542 Using Custom Components
                                            • 55 WRAPPER DEVELOPMENT
                                              • REFERENCES
Page 56: Denodo ITPilot 4.6 Developer Guidehelp.denodo.com/.../4.6/DenodoITPilot.Developer.pdf · 2017-10-25 · ITPilot 4.6 Developer Guide Deploying and Invoking ITPilot Wrapper Access Web

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 51

5321 Output

bull Object Output

bull Description this places a record in the wrapper output

bull Functions

o Constructor(structure)

bull structure parameter that indicates the component input record to be used as the wrapper result

o add(record) this allows for the component input record to be used as the wrapper result to be subsequently added

bull record record to use

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 52

5322 Record Constructor

bull Object Record_Constructor

bull Description this allows a record to be constructed using other records generated in the flow as well as generating new attributes derived from already existing ones

bull Functions

o Constructor(recordsObj name)

bull recordsObj list of input elements Each element from the list can be a record or a list of records

bull name name of the output record of the Record Constructor component

o add(fieldName expression errorAction) method for adding a new field to the record under construction

bull fieldname name of the field

bull expression field definition expression eg ldquo$0PARAM1rdquo indicates that the field will contain the field PARAM1 from the first input record of the recordsObj list entered in the constructor To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]

bull errorAction action to be run in the event of it not being possible to assess the expression correctly The possible values are

bull ON_ERROR_RAISE stop wrapper run indicating the source of the error

bull ON_ERROR_IGNORE ignore the error continuing with the wrapper run

o exec() this runs the Record Constructor component instance returning an object that represents the record obtained

NOTE If the error handler or this component is set to ON_ERROR_IGNORE RECORD CONSTRUCTOR will return the list of filtered elements except for the one that caused the error

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 53

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: creates a browsing sequence from the results of a record. It allows sequences to be created to access other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.

    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used so far is reused, or whether a new browser that keeps the session’s information is launched. In general this value will be “true”, although this may not be a good option if the previous iterator runs in parallel with this component.

    • inputPage: optional; allows a starting page to be indicated.

  o exec(): returns a page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: accepts a browser identifier, or a page that identifies a browser, and releases that specific browser.

• Functions:

  o Constructor(page)

    • page: page loaded in the browser that is going to be released.

  o Constructor(browserUuid)

    • browserUuid: browser identifier.

  o exec(): executes the component.

5.3.25 Repeat

• Description: allows loops to be created in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop, with a Condition object used as the loop exit condition. The Condition object is described in section 5.3.5. To define the condition expression, ITPilot provides a set of functions described in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7 Using the Repeat function
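The same loop shape can be tried outside ITPilot with a stand-in Condition, as in the minimal sketch below. The stand-in wraps a plain JavaScript predicate instead of an ITPilot expression, and the page names and counters are invented for the example; only the do… while structure mirrors the figure.

```javascript
// Stand-ins (assumptions): the real Condition evaluates an ITPilot expression;
// here it wraps a JavaScript predicate so that the loop can actually run.
const RUNTIME_ERROR = "RUNTIME_ERROR";
const ON_ERROR_RAISE = "ON_ERROR_RAISE";

class Condition {
  constructor(predicate) {
    this.predicate = predicate;
  }
  onError(errorType, action) {
    this.errorAction = action; // recorded only; the stand-in never fails
  }
  exec(inputs) {
    return Boolean(this.predicate(inputs));
  }
}

const pages = ["page 1", "page 2", "page 3"];
const visited = [];
let index = 0;

const repeat = new Condition(() => index < pages.length);
repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
do {
  visited.push(pages[index]); // loop operations
  index += 1;
} while (repeat.exec([]));

console.log(visited);
```

Because the body of a do… while loop runs before the condition is checked, the loop operations always execute at least once.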

5.3.26 Script

• Description: allows part of the logic of an ITPilot wrapper to be written in JavaScript. This component has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from the corresponding place in the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)

    • sequence: NSEQL browsing program (see [NSEQL]).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • inputPage: optional parameter that indicates the starting page. If it is not provided, the NSEQL program is run directly.

  o exec(inputValues, inputPage): runs the Sequence component and returns the last page reached by the browsing sequence.

    • inputValues: list of values that can be used as input parameters within the browsing sequence.

    • inputPage: optional parameter that describes the page from which the component’s browsing sequence is run.

  o setRetries(count): sets the number of retries in the event of failure.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o close(): closes the connection with the running browser.

  o syncWithPost(flag): indicates whether, to retrieve the status of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If not, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because no back sequence has been explicitly defined and no POST navigation has been configured as the back sequence.

    • pages: number of back pages.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
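The way syncWithPost, setBackSequence and setBackPages interact can be summarized as a small decision function. The sketch below is not part of the ITPilot API: resolveBackStrategy, its option names and the default of one back page are hypothetical, introduced only to mirror the order of precedence described for these methods.

```javascript
// Hypothetical helper (not an ITPilot function): mirrors the documented order
// in which the way back to the source page is chosen.
function resolveBackStrategy({
  syncWithPost = true, // POST synchronization is the documented default
  backSequence = null, // value that would be passed to setBackSequence
  backPages = 1,       // assumed default for setBackPages
} = {}) {
  if (syncWithPost) {
    return { kind: "POST", detail: "re-issue the POST request that reached the page" };
  }
  if (backSequence !== null) {
    return { kind: "BACK_SEQUENCE", detail: backSequence };
  }
  return { kind: "NSEQL_BACK", detail: "Back() over " + backPages + " page(s)" };
}

const byDefault = resolveBackStrategy();
const explicit = resolveBackStrategy({
  syncWithPost: false,
  backSequence: "<NSEQL back program>",
});
const fallback = resolveBackStrategy({ syncWithPost: false, backPages: 2 });

console.log(byDefault.kind, explicit.kind, fallback.kind);
```

Read this way, setBackPages only matters in the last case, when neither POST synchronization nor an explicit back sequence applies.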

5.3.28 Store File

• Object: StoreFile

• Description: stores the contents received as the input parameter in a file.

• Functions:

  o Constructor(content, file)

    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case, the page content is stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.

    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): sets the number of retries in the event of failure.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.
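To make the role of setGenerateFilename more concrete, the following sketch uses a stand-in StoreFile built on the Node.js file system module. Everything about the stand-in is an assumption made for illustration: the generated file name pattern, the fact that exec() returns the path that was written, and the error raised when no name can be determined are not taken from ITPilot.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Stand-in for the ITPilot StoreFile component (assumption, illustration only).
class StoreFile {
  constructor(content, file) {
    this.content = content;
    this.file = file;
    this.generate = false;
  }

  setGenerateFilename(generate) {
    this.generate = generate;
  }

  // Writes the content and returns the path used (the return value is an
  // assumption of this sketch).
  exec() {
    let target = this.file;
    const isDirectory =
      target !== null && fs.existsSync(target) && fs.statSync(target).isDirectory();
    if (target === null || isDirectory) {
      if (!this.generate) {
        throw new Error("No file name given and automatic generation is disabled");
      }
      const directory = isDirectory ? target : os.tmpdir();
      target = path.join(directory, "stored-" + Date.now() + ".txt");
    }
    fs.writeFileSync(target, this.content);
    return target;
  }
}

// Usage: pass a directory and let the component pick the file name.
const outputDir = fs.mkdtempSync(path.join(os.tmpdir(), "itp-"));
const store = new StoreFile("extracted text", outputDir);
store.setGenerateFilename(true);
const storedPath = store.exec();
console.log(storedPath);
```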

5.3.29 Thread

• Object: Thread

• Description: represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): causes the thread to wait until all executions launched with the execute function have finished.

  o execute(functionName, <list of arguments>): launches a thread of execution on the given function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.

    • int: maximum number of concurrent threads.
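The usual calling pattern (one execute call per record, followed by a single wait) can be sketched with a stand-in Thread. The stand-in below does not run anything in parallel: it queues the calls and runs them in batches inside wait(), purely to show the order of the calls. The registry used to look functions up by name, the processRecord function and the sample records are all hypothetical; how ITPilot resolves functionName is not described here.

```javascript
// Hypothetical lookup table from function name to function.
const registry = {};

// Stand-in for the ITPilot Thread object (assumption, no real concurrency).
class Thread {
  constructor() {
    this.maxConcurrent = 1;
    this.pending = [];
  }

  setMaxConcurrentThreads(max) {
    this.maxConcurrent = max;
  }

  // Queues a call to the named function with the given arguments.
  execute(functionName, ...args) {
    const fn = registry[functionName];
    if (typeof fn !== "function") {
      throw new Error("Unknown function: " + functionName);
    }
    this.pending.push(() => fn(...args));
  }

  // Runs the queued calls in batches of at most maxConcurrent, then returns.
  wait() {
    while (this.pending.length > 0) {
      const batch = this.pending.splice(0, this.maxConcurrent);
      batch.forEach((task) => task());
    }
  }
}

// Usage: process every extracted record, then wait for all of them.
const results = [];
registry.processRecord = function processRecord(record, prefix) {
  results.push(prefix + record.TITLE);
};

const records = [{ TITLE: "A" }, { TITLE: "B" }, { TITLE: "C" }];
const thread = new Thread();
thread.setMaxConcurrentThreads(2);
for (const record of records) {
  thread.execute("processRecord", record, "item:");
}
thread.wait();
console.log(results);
```

With real concurrency the order of the results would not be guaranteed, so code that runs after wait() should not rely on it.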

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:

    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function defines the output structure returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
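A minimal sketch of such a file is shown below for a hypothetical component named “shout” that returns its input in upper case. The component name, the assumption that the input arrives as a plain string, and the empty body of the input structure function are illustrative choices only; the value 13 for STRING_TYPE is taken from the list above, and the constant is declared here only so that the sketch runs outside ITPilot.

```javascript
// Declared here so the sketch is self-contained; the value comes from the
// list of output types above.
const STRING_TYPE = 13;

// Main function of the hypothetical "shout" component.
// Assumption: the input is treated as a plain string.
function shout_main(shout_input) {
  var shout_output = null;
  shout_output = String(shout_input).toUpperCase();
  return shout_output;
}

// Input schema definition. Its format is not covered in this section, so the
// body is deliberately left empty in this sketch.
function shout_getInputStructure() {}

// The component returns a string value.
function shout_getOutputType() {
  return STRING_TYPE;
}

// shout_getOutputStructure() is omitted because the output type is neither
// RECORD_TYPE nor LIST_TYPE.

console.log(shout_main("hello"), shout_getOutputType());
```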

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the generated JavaScript code.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so if quotes are used inside the code they must be escaped with the ‘’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

Page 58: Denodo ITPilot 4.6 Developer Guidehelp.denodo.com/.../4.6/DenodoITPilot.Developer.pdf · 2017-10-25 · ITPilot 4.6 Developer Guide Deploying and Invoking ITPilot Wrapper Access Web

ITPilot 4.6 Developer Guide

Developing ITPilot Wrappers with JavaScript

5.3.23 Record Sequence or Extractor Sequence

• Object: Record_Sequence

• Description: Creates a browsing sequence from the results of a record. It allows sequences to be created for accessing other pages from pages processed by the Extractor component.

• Functions:

  o Constructor(sequences, sequenceDepends, sequenceType, reuse, inputPage)

    • sequences: ordered list of the NSEQL browsing sequences to be used by the component.

    • sequenceDepends: ordered list of the DEXTL tags associated with each NSEQL browsing sequence in the sequences list.

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reuse: Boolean value that indicates whether the browser used so far is reused or whether a new browser is launched, maintaining the session’s information. In general this value will be “true”, although in some cases it may not be a good option if the previous iterator is run in parallel.

    • inputPage: optional; allows a start page to be indicated.

  o exec(): returns a page object that represents the target page of the browsing sequences.

  o All of the methods offered by the Sequence component.
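The sequences and sequenceDepends arguments read as parallel lists: the entry at a given position of sequenceDepends is the DEXTL tag associated with the NSEQL sequence at the same position of sequences. The following plain-JavaScript sketch illustrates that pairing. It assumes one tag per sequence; pairSequences is a hypothetical helper that is not part of ITPilot, and the strings are placeholders rather than real NSEQL programs or DEXTL tags.

```javascript
// Hypothetical helper, not an ITPilot function: pairs each NSEQL sequence
// with the DEXTL tag found at the same position in the second list.
// Assumption: there is exactly one tag per sequence.
function pairSequences(sequences, sequenceDepends) {
  if (sequences.length !== sequenceDepends.length) {
    throw new Error("sequences and sequenceDepends must have the same length");
  }
  var pairs = [];
  for (var i = 0; i < sequences.length; i++) {
    pairs.push({ sequence: sequences[i], tag: sequenceDepends[i] });
  }
  return pairs;
}

// Placeholder values only.
var pairs = pairSequences(
  ["<NSEQL sequence 1>", "<NSEQL sequence 2>"],
  ["<DEXTL tag 1>", "<DEXTL tag 2>"]
);
```

Checking the two lists before building the component may help catch a missing tag early, since a mismatch would otherwise only show up when the wrapper runs.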

5.3.24 Release Persistent Browser

• Object: Release_Persistent_Browser

• Description: Accepts a browser id or a page as browser identifier and releases that specific browser.

• Functions:

  o Constructor(page)

    • page: page loaded on the browser that is going to be released.

  o Constructor(browserUuid)

    • browserUuid: browser identifier.

  o exec(): executes the component.

5.3.25 Repeat

• Description: Allows loops to be made in the flow. The loop is repeated until the given condition is met (REPEAT… UNTIL). The Repeat component is implemented in JavaScript using a do… while loop with a Condition object used as the loop output condition. The Condition object is defined in section 5.3.5. To define the condition expression, ITPilot provides a set of functions defined in Appendix A of [GENER].

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        …
    } while (repeat.exec([]));

Figure 7: Using the Repeat function
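The REPEAT… UNTIL behaviour can be illustrated outside ITPilot with an ordinary do… while loop. In the sketch below, conditionMet() is a hypothetical stand-in for the output condition. It is not the ITPilot Condition object, and the sketch makes no claim about what Condition.exec() returns; it only shows that the loop body runs at least once and that the loop stops as soon as the condition is met.

```javascript
// Plain JavaScript, runnable outside ITPilot.
// conditionMet() is a hypothetical stand-in for the output condition.
var attempts = 0;

function conditionMet() {
  return attempts >= 3;
}

do {
  attempts++; // the loop operations would go here
} while (!conditionMet());

// A second loop whose condition is already met still runs its body once.
var runs = 0;
do {
  runs++;
} while (!conditionMet());
```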

5.3.26 Script

• Description: This component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from its position within the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: Creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)

    • sequence: NSEQL browsing program (see [NSEQL]).

    • sequenceType: type of pool to use. The possible values are:

      • SEQUENCE_IEBROWSER

      • SEQUENCE_HTTP_BROWSER

      • SEQUENCE_FTP

      • SEQUENCE_LOCAL

    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.

    • inputPage: optional parameter indicating the starting page. If it is not given, the NSEQL program is run directly.

  o exec(inputValues, inputPage): runs the Sequence component, returning the last page that the browsing sequence has reached.

    • inputValues: list of values that can be used as input parameters within the browsing sequence.

    • inputPage: optional parameter describing the page from which the component's browsing sequence is run.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o close(): closes the connection with the running browser.

  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out there.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported.

  o setBackPages(pages): determines the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run because neither a back sequence has been explicitly defined nor a POST navigation has been configured as the back sequence.

    • pages: number of back pages.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): determines the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation.

    • 1: Internet Explorer browser implementation.

    • 2: Firefox browser implementation.

    • 3: Denodo HTTP browser implementation.
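The way syncWithPost, setBackSequence and the NSEQL Back() command relate to each other can be summarized as a small decision procedure. The sketch below is a plain-JavaScript model of the description above, not ITPilot code: chooseReturnStrategy, its config argument and the returned labels are all hypothetical names introduced for illustration.

```javascript
// Hypothetical model of how the source page is recovered, following the
// descriptions of syncWithPost, setBackSequence and setBackPages.
// The function name, the config fields and the labels are illustrative only.
function chooseReturnStrategy(config) {
  if (config.syncWithPost) {
    return "POST";          // re-issue the POST with which the page was reached
  }
  if (config.backSequence) {
    return "BACK_SEQUENCE"; // run the sequence given to setBackSequence
  }
  return "BACK";            // run NSEQL Back(); setBackPages sets how many pages
}

var withPost = chooseReturnStrategy({ syncWithPost: true, backSequence: "<NSEQL back program>" });
var withSequence = chooseReturnStrategy({ syncWithPost: false, backSequence: "<NSEQL back program>" });
var withBack = chooseReturnStrategy({ syncWithPost: false });
```

In this model the POST synchronization takes precedence, which matches its description as the default; an explicit back sequence is only consulted when the flag is “false”.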

5.3.28 Store File

• Object: StoreFile

• Description: Stores the contents received as the input parameter in a file.

• Functions:

  o Constructor(content, file)

    • content: string- or binary-type value with the contents to be stored. A page value is also supported as input; in that case the page content is stored.

    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.

    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.
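The combined effect of setRetries and setRetryDelay can be pictured with a generic retry loop. The sketch below is plain JavaScript and not ITPilot code: runWithRetries is a hypothetical helper, the count is assumed to mean additional attempts after the first one, and the wait is delegated to an injected sleep function so that the example stays synchronous.

```javascript
// Hypothetical helper, not ITPilot code. Runs `operation` once and then up to
// `retries` more times if it throws, calling `sleep(delayMs)` between attempts.
// Assumption: `retries` counts the additional attempts after the first one.
function runWithRetries(operation, retries, delayMs, sleep) {
  var lastError = null;
  for (var attempt = 0; attempt <= retries; attempt++) {
    try {
      return operation(attempt);
    } catch (e) {
      lastError = e;
      if (attempt < retries) {
        sleep(delayMs); // wait before the next retry
      }
    }
  }
  throw lastError; // every attempt failed
}

// Usage: the operation fails twice and succeeds on the third attempt.
var waits = [];
var result = runWithRetries(
  function (attempt) {
    if (attempt < 2) { throw new Error("temporary failure"); }
    return "stored";
  },
  3,
  500,
  function (ms) { waits.push(ms); } // records the delay instead of sleeping
);
```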

5.3.29 Thread

• Object: Thread

• Description: Represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is carried out concurrently.

• Functions:

  o wait(): causes the thread to wait until all executions invoked with the execute function have finished.

  o execute(functionName, <list of arguments>): launches a thread that runs the given function.

    • functionName: name of the JavaScript function to be run.

    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): sets the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.

    • int: maximum number.
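The queueing behaviour described for setMaxConcurrentThreads can be modelled with a small bookkeeping object. The sketch below is a plain-JavaScript model and not the ITPilot Thread object: BoundedRunner and its methods are hypothetical, tasks are represented only by name, and first-in first-out ordering of the queue is an assumption.

```javascript
// Hypothetical model, not the ITPilot Thread object: at most `max` tasks are
// marked as running; further requests wait in a queue (assumed to be FIFO).
function BoundedRunner(max) {
  this.max = max;
  this.running = [];
  this.queued = [];
}

// Start the task now if a slot is free, otherwise queue it.
BoundedRunner.prototype.execute = function (name) {
  if (this.running.length < this.max) {
    this.running.push(name);
  } else {
    this.queued.push(name);
  }
};

// Mark a running task as finished and promote the oldest queued task.
BoundedRunner.prototype.finish = function (name) {
  var index = this.running.indexOf(name);
  if (index === -1) { return; }
  this.running.splice(index, 1);
  if (this.queued.length > 0) {
    this.running.push(this.queued.shift());
  }
};

var runner = new BoundedRunner(2);
runner.execute("record1");
runner.execute("record2");
runner.execute("record3"); // queued: both slots are busy
var queuedBefore = runner.queued.slice();
runner.finish("record1");  // record3 takes the freed slot
```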

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically by using the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; … return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { … }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:

    • LIST_TYPE = 1

    • PAGE_TYPE = 2

    • RECORD_TYPE = 3

    • SIMPLE_TYPE = 4

    • ARRAY_TYPE = 5

    • BINARY_TYPE = 6

    • BOOLEAN_TYPE = 7

    • DATE_TYPE = 8

    • DOUBLE_TYPE = 9

    • FLOAT_TYPE = 10

    • INT_TYPE = 11

    • LONG_TYPE = 12

    • STRING_TYPE = 13

    • URL_TYPE = 14

    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { … }

  o This function defines the output structure that will be returned by the component. It is necessary only when the output type defined by the function mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
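As a rough illustration of how these functions fit together, the sketch below outlines a component named “mycustom” that returns a string. It is plain JavaScript under several assumptions: the OUTPUT_TYPES object is defined locally so that the example runs outside ITPilot, needsOutputStructure is a hypothetical helper, and the input is treated as a simple string. mycustom_getInputStructure is left out because the schema format is not described in this section.

```javascript
// Output type codes as listed in this section, defined locally for the sketch.
var OUTPUT_TYPES = {
  LIST_TYPE: 1, PAGE_TYPE: 2, RECORD_TYPE: 3, SIMPLE_TYPE: 4, ARRAY_TYPE: 5,
  BINARY_TYPE: 6, BOOLEAN_TYPE: 7, DATE_TYPE: 8, DOUBLE_TYPE: 9,
  FLOAT_TYPE: 10, INT_TYPE: 11, LONG_TYPE: 12, STRING_TYPE: 13,
  URL_TYPE: 14, BROWSER_ID_TYPE: 15
};

// Hypothetical helper: true when the component must also define
// <name>_getOutputStructure(), that is, for RECORD_TYPE and LIST_TYPE outputs.
function needsOutputStructure(outputType) {
  return outputType === OUTPUT_TYPES.RECORD_TYPE ||
         outputType === OUTPUT_TYPES.LIST_TYPE;
}

// The component declares a string output, so no output structure is needed.
function mycustom_getOutputType() {
  return OUTPUT_TYPES.STRING_TYPE;
}

// Assumption: the input is handled as a plain string and upper-cased.
function mycustom_main(mycustom_input) {
  var mycustom_output = String(mycustom_input).toUpperCase();
  return mycustom_output;
}
```

A component whose mycustom_getOutputType returned RECORD_TYPE or LIST_TYPE would additionally have to provide mycustom_getOutputStructure.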

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, the following piece of code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> is the name of the component.

• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.
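The escaping step in the note can be automated before the code is pasted into the statement. The sketch below is plain JavaScript and not part of ITPilot or of any VQL tooling: escapeQuotes is a hypothetical helper, and the choice of the double quote as the delimiter in the usage lines is an assumption made only for illustration. The helper escapes nothing but the quote character passed to it.

```javascript
// Hypothetical helper: prefixes every occurrence of the delimiting quote
// character with a backslash. No other character is touched.
function escapeQuotes(jsCode, quoteChar) {
  return jsCode.split(quoteChar).join("\\" + quoteChar);
}

// Assumption for illustration: the delimiter is the double quote.
var jsCode = 'var greeting = "hello";';
var escaped = escapeQuotes(jsCode, '"');
```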

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

Page 60: Denodo ITPilot 4.6 Developer Guidehelp.denodo.com/.../4.6/DenodoITPilot.Developer.pdf · 2017-10-25 · ITPilot 4.6 Developer Guide Deploying and Invoking ITPilot Wrapper Access Web

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 55

5325 Repeat

bull Description This allows for loops to be made in the flow The loop is repeated until the given condition is met (REPEAThellip UNTIL) The Repeat component is implemented in JavaScript using a dohellip while loop with a Condition object used as the loop output condition The Condition object is defined in section 535 To define the condition expression ITPilot provides a set of functions defined in Appendix A of [GENER]

    var repeat = null;
    repeat = new Condition(<output_condition>);
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        <loop_operations>
        ...
    } while (repeat.exec([]));   // condition is evaluated after each pass

Figure 7: Using the Repeat function
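As a rough illustration of the pattern in Figure 7, the following sketch can be run outside ITPilot. The Condition constructor used here is a hypothetical stand-in that wraps a plain JavaScript predicate (the real component takes a condition expression, see section 5.3.5), and RUNTIME_ERROR and ON_ERROR_RAISE are given placeholder values. Only the do ... while shape is taken from the text; collectPages is an invented helper name.

```javascript
// Hypothetical stand-ins so the sketch runs outside ITPilot.
var RUNTIME_ERROR = "RUNTIME_ERROR";   // placeholder value (assumption)
var ON_ERROR_RAISE = "ON_ERROR_RAISE"; // placeholder value (assumption)

// Stand-in Condition: wraps a predicate function instead of an ITPilot
// condition expression.
function Condition(predicate) {
    this.predicate = predicate;
    this.errorPolicy = null;
}
Condition.prototype.onError = function (errorType, action) {
    this.errorPolicy = [errorType, action];
};
Condition.prototype.exec = function (inputValues) {
    return this.predicate(inputValues);
};

// Visit pages 1..maxPages; the loop condition is checked after each pass.
function collectPages(maxPages) {
    var visited = [];
    var page = 0;
    var repeat = new Condition(function () { return page < maxPages; });
    repeat.onError(RUNTIME_ERROR, ON_ERROR_RAISE);
    do {
        page = page + 1;       // stands in for <loop_operations>
        visited.push(page);
    } while (repeat.exec([]));
    return visited;
}
```

Because the condition is evaluated after the loop body, the loop operations always run at least once: in this sketch collectPages(0) still visits one page.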

5.3.26 Script

• Description: This component allows part of the logic of an ITPilot wrapper to be written in JavaScript. It has no specific JavaScript function associated with it. When the component is used from the graphical generation interface, it becomes a JavaScript function that is invoked from its position in the process flow.

5.3.27 Sequence

• Object: Sequence

• Description: This creates a browsing sequence in the NSEQL language (see [NSEQL]).

• Functions:

  o Constructor(sequence, sequenceType, reusableConnection, inputPage)

    • sequence: NSEQL browsing program (see [NSEQL]).
    • sequenceType: type of pool to use. The possible values are:
      • SEQUENCE_IEBROWSER
      • SEQUENCE_HTTP_BROWSER
      • SEQUENCE_FTP
      • SEQUENCE_LOCAL
    • reusableConnection: indicates whether the connection will be reused (“true”) or not (“false”). See [GENER] for further information.
    • inputPage: optional parameter that indicates the starting page. If it is not given, the NSEQL program is run directly.

  o exec(inputValues, inputPage): runs the Sequence component and returns the last page that the browsing sequence has reached.

    • inputValues: list of values that can be used as input parameters within the browsing sequence.
    • inputPage: optional parameter that indicates the page from which the browsing sequence of the component is run.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

  o close(): closes the connection with the running browser.

  o syncWithPost(flag): indicates whether, to recover the state of the page, a POST message must be issued to the page URL containing the POST parameters with which the page was reached. This is the default synchronization function.

    • flag: “true” indicates that this synchronization function must be used. If it is “false”, ITPilot checks whether a back sequence has been defined with the setBackSequence method. If there is none, an NSEQL Back() command is run.

  o setBackSequence(back): optionally specifies an explicit browsing sequence that returns to the source page, so that further data extraction operations can be carried out on it.

    • back: NSEQL back program.

  o setReusingConnection(reusingConnection): indicates whether the connection will be reused or not.

    • reusingConnection: if “true”, the connection from previous components is reused. If “false”, a new browser is opened and the data from the previous session is imported into it.

  o setBackPages(pages): sets the number of pages that ITPilot must browse back when the NSEQL Back() command has to be run, that is, when no back sequence has been explicitly defined and POST navigation has not been configured as the back sequence.

    • pages: number of pages to go back.

  o toString(): returns the NSEQL sequence (see [NSEQL]).

  o setBrowserType(browserType): sets the browser implementation to use in the component. The accepted values are:

    • 0: default browser implementation
    • 1: Internet Explorer browser implementation
    • 2: Firefox browser implementation
    • 3: Denodo HTTP browser implementation
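The functions above could be combined roughly as follows. This is only a sketch: the Sequence defined here is a hypothetical stand-in that records its configuration so the code runs outside ITPilot (the real component drives a browser), the value given to SEQUENCE_IEBROWSER is a placeholder, the NSEQL program is left as a placeholder string, and passing reusableConnection as a JavaScript boolean is an assumption. configureSequence is an invented helper name.

```javascript
// Hypothetical stand-in for the ITPilot Sequence object.
var SEQUENCE_IEBROWSER = "SEQUENCE_IEBROWSER"; // placeholder value (assumption)

function Sequence(sequence, sequenceType, reusableConnection, inputPage) {
    this.sequence = sequence;
    this.sequenceType = sequenceType;
    this.reusableConnection = reusableConnection;
    this.inputPage = inputPage;
    this.retries = 0;
    this.retryDelay = 0;
    this.backPages = null;
    this.postSync = true;   // the text describes POST synchronization as the default
    this.closed = false;
}
Sequence.prototype.setRetries = function (count) { this.retries = count; };
Sequence.prototype.setRetryDelay = function (mseconds) { this.retryDelay = mseconds; };
Sequence.prototype.syncWithPost = function (flag) { this.postSync = flag; };
Sequence.prototype.setBackPages = function (pages) { this.backPages = pages; };
Sequence.prototype.exec = function (inputValues, inputPage) {
    // Stand-in result; the real exec returns the last page reached.
    return { reachedBy: this.sequence, inputs: inputValues };
};
Sequence.prototype.close = function () { this.closed = true; };
Sequence.prototype.toString = function () { return this.sequence; };

// Build a sequence with a retry policy and Back()-based synchronization.
function configureSequence(nseqlProgram) {
    var seq = new Sequence(nseqlProgram, SEQUENCE_IEBROWSER, false);
    seq.setRetries(3);         // retry up to three times on failure
    seq.setRetryDelay(2000);   // wait 2000 ms between retries
    seq.syncWithPost(false);   // use a back sequence or Back() instead of POST
    seq.setBackPages(1);       // go back one page when Back() is run
    return seq;
}

var seq = configureSequence("<NSEQL browsing program>");
var page = seq.exec([]);       // no input values in this sketch
// ... `page` would be handed to later components here ...
seq.close();                   // release the browser connection when done
```

The retry settings are applied before exec so that they are already in place when the browsing sequence runs, and close is called only after the returned page is no longer needed.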

5.3.28 Store File

• Object: StoreFile

• Description: This stores the contents passed as the input parameter in a file.

• Functions:

  o Constructor(content, file)

    • content: string- or binary-type value that holds the contents to be stored. A page value is also supported as input; in that case, the page content is stored.
    • file: path and name of the file where the contents are to be stored.

  o exec(): runs the component.

  o setGenerateFilename(generate): determines whether the output file name should be generated automatically when the input file is null or is a directory.

    • generate: indicates whether the file name should be generated automatically.

  o setRetries(count): sets the number of retries in the event of failures.

    • count: number of retries.

  o setRetryDelay(mseconds): sets the waiting time between retries.

    • mseconds: waiting time between retries, in milliseconds.

5.3.29 Thread

• Object: Thread

• Description: This represents a thread in the ITPilot wrapper. It is often used when the subsequent processing of each of the records obtained in an extraction operation is to be carried out concurrently.

• Functions:

  o wait(): causes the thread to wait until all the executions launched with the execute function have finished.

  o execute(functionName, <list of arguments>): launches a thread that runs the given function.

    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: list of arguments, separated by commas, which must match the arguments of the JavaScript function.

  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that will be used in parallel. Later requests are queued until the ongoing executions finish.

    • int: maximum number of concurrent threads.
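The usual execute-then-wait pattern might look like the sketch below. The Thread defined here is a hypothetical, single-threaded stand-in: it queues the calls and runs them one after another when wait() is called, so it shows the calling pattern but not real concurrency. It is also an assumption that the function is passed as a reference; the text above only says that execute receives the name of the function. processRecord and processAll are invented names.

```javascript
// Hypothetical single-threaded stand-in for the ITPilot Thread object.
function Thread() {
    this.pending = [];
    this.maxConcurrent = null;
}
Thread.prototype.setMaxConcurrentThreads = function (max) {
    this.maxConcurrent = max;   // recorded only; the stand-in runs jobs serially
};
Thread.prototype.execute = function (fn) {
    // Everything after the first argument is forwarded to the function.
    var args = Array.prototype.slice.call(arguments, 1);
    this.pending.push({ fn: fn, args: args });
};
Thread.prototype.wait = function () {
    while (this.pending.length > 0) {
        var job = this.pending.shift();
        job.fn.apply(null, job.args);
    }
};

// Per-record work: here it just upper-cases a title and stores the result.
function processRecord(record, results) {
    results.push(record.title.toUpperCase());
}

// Launch one execution per record, then block until all have finished.
function processAll(records) {
    var results = [];
    var thread = new Thread();
    thread.setMaxConcurrentThreads(2);
    for (var i = 0; i < records.length; i++) {
        thread.execute(processRecord, records[i], results);
    }
    thread.wait();
    return results;
}
```

With real concurrent execution the order in which results are appended would not be guaranteed, so code that depends on record order should not rely on the order seen in this serial stand-in.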

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, a file with the .js suffix must be created and stored in the path <DENODO_HOME>/metadata/itp-custom-components, containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }

  o This is the main function, where “mycustom” is the name of the custom component.

• mycustom_getInputStructure() { ... }

  o This function defines the input schema.

• mycustom_getOutputType() { return <TYPE>; }

  o This function defines the component output type. The possible values are:

    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15

• mycustom_getOutputStructure() { ... }

  o This function defines the output structure returned by the component. It is needed only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
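A component file following the function list above might be laid out as in this sketch. It assumes a component named “mycustom” whose input is a single string, which is an illustrative choice and not something the guide specifies. STRING_TYPE is declared locally with the value listed above so that the file also runs outside ITPilot, and the body of mycustom_getInputStructure is left as a placeholder because the schema-building calls are not described in this section.

```javascript
// Output type constant with the value listed above; declared here only so the
// sketch is self-contained.
var STRING_TYPE = 13;

// Main function of the hypothetical "mycustom" component: receives the
// component input and returns the component output.
function mycustom_main(mycustom_input) {
    var mycustom_output = null;
    // Illustrative logic only: collapse runs of whitespace and trim the ends.
    mycustom_output = String(mycustom_input)
        .replace(/\s+/g, " ")
        .replace(/^ | $/g, "");
    return mycustom_output;
}

// Input schema definition. Placeholder: the calls needed to describe the
// schema are not shown in this section.
function mycustom_getInputStructure() {
    return null;
}

// The component returns a plain string.
function mycustom_getOutputType() {
    return STRING_TYPE;
}

// mycustom_getOutputStructure is omitted because the output type is neither
// RECORD_TYPE nor LIST_TYPE.
```

Keeping every function prefixed with the component name matches the naming scheme described above, where the main function is identified as <component>_main.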

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored as a JavaScript file (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by one .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as described in the previous section. To use a custom component from a wrapper developed in JavaScript, the following code should be used:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8: Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.
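The point of the try ... finally shape in Figure 8 is that the scope is closed even when the component fails. The sketch below demonstrates this outside ITPilot with hypothetical stand-ins: SCOPE only counts open scopes, CUSTOM_COMPONENT dispatches to an in-memory table of main functions (how the real engine locates <component>_main is not modelled), and the component type is left as a placeholder string. callComponent and COMPONENT_MAINS are invented names.

```javascript
// Hypothetical stand-ins so the sketch runs outside ITPilot.
var SCOPE = {
    depth: 0,
    create: function () { this.depth = this.depth + 1; },
    close: function () { this.depth = this.depth - 1; }
};

// In-memory table of component main functions, keyed by component name.
var COMPONENT_MAINS = {
    mycustom: function (input) { return "processed:" + input; },
    failing: function (input) { throw new Error("component failed"); }
};

function CUSTOM_COMPONENT(componentType) {
    this.componentType = componentType;
    this.componentName = null;
}
CUSTOM_COMPONENT.prototype.setComponentName = function (name) {
    this.componentName = name;
};
CUSTOM_COMPONENT.prototype.exec = function (input) {
    return COMPONENT_MAINS[this.componentName](input);
};

// Invoke a custom component following the pattern of Figure 8.
function callComponent(componentName, input) {
    var output = null;
    try {
        SCOPE.create();
        var component = new CUSTOM_COMPONENT("<customcomponent_type>");
        component.setComponentName(componentName);
        output = component.exec(input);
    } finally {
        SCOPE.close();   // runs whether exec returns or throws
    }
    return output;
}
```

After a call to callComponent, SCOPE.depth is back at its previous value whether the component returned normally or threw, which is the behavior the finally clause is there to guarantee.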

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper is straightforward. The following VQL statement is written:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been generated.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘’ character.

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

Page 62: Denodo ITPilot 4.6 Developer Guidehelp.denodo.com/.../4.6/DenodoITPilot.Developer.pdf · 2017-10-25 · ITPilot 4.6 Developer Guide Deploying and Invoking ITPilot Wrapper Access Web

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 57

5327 Sequence

bull Object Sequence

bull Description This creates a browsing sequence in NSEQL language (see [NSEQL])

bull Functions

o Constructor(sequence sequenceType reusableConnection inputPage)

bull sequence NSEQL browsing program (see [NSEQL]) bull sequenceType type of pool to use The possible values are

bull SEQUENCE_IEBROWSER

bull SEQUENCE_HTTP_BROWSER

bull SEQUENCE_FTP

bull SEQUENCE_LOCAL

bull reusableConnection this indicates whether the connection will be reused (ldquotruerdquo) or not (ldquofalserdquo) See [GENER] for further information

bull inputPage optional parameter this indicates the starting page If not the NSEQL program is run directly

o exec(inputValues inputPage) this runs the Sequence component returning the last page that the browsing sequence has reached

bull inputValues list of values that can be used as input parameters within the browsing sequence

bull inputPage optional parameter this describes the page from which the component browsing sequence is run

o setRetries(count) update function for the number of retries in the event of failures

bull count number of retries

o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated

bull mseconds this indicates the waiting time between retries in milliseconds

o close() this closes the connection with the running browser

o syncWithPost(flag) this method indicates whether to retrieve the status of the page a POST message must be issued to the page URL containing the POST parameters with which it arrived This is the default synchronization function

bull flag ldquotruerdquo indicates that this synchronization function must be used If it is ldquofalserdquo ITPilot checks whether there is a back sequence defined with a setBackSequence method If there is not an NSEQL Back() command is run

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 58

o setBackSequence(back) this function optionally allows for a browsing sequence explicit to its source page to be indicated for more data extraction operations to be carried out

bull back NSEQL back program

o setReusingConnection(reusingConnection) this indicates whether the connection will be reused or not

bull reusingConnection if ldquotruerdquo the connection from previous components will be reused With the parameter set to ldquofalserdquo a new browser is opened and the data imported from the previous session

o setBackPages(pages) determines the number of pages that ITPilot must browse back when the NSEQL Back() command must be run because no back sequence has been explicitly defined nor a post navigation has been configured as back sequence

bull pages number of back pages

o toString() this returns the NSEQL (see [NSEQL]) sequence

o setBrowserType(browserType) this function determines the browser implementation to use in the component The accepted values are

bull 0 default browser implementation bull 1 Internet Explorer browser implementation bull 2 Firefox browser implementation bull 3 Denodo HTTP browser implementation

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 59

5328 Store File

bull Object StoreFile

bull Description this stores the contents entered as the input parameter in a file

bull Functions

o Constructor(content file)

bull content string- or binary-type value that indicates the contents to be stored A page value is also supported as input In that case the page content will be stored

bull file path and name of the file where the contents are to be stored

o exec() runs the component

o setGenerateFilename(generate) this function determines if the output file name should be automatically generated when the input file is null or is a directory

bull generate indicates if the file name should be automatically generated

o setRetries(count) update function for the number of retries in the event of failures

bull count number of retries

o setRetryDelay(mseconds) this allows for the waiting time between retries to be indicated

bull mseconds this indicates the waiting time between retries in milliseconds

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 60

5.3.29 Thread

• Object: Thread
• Description: represents a thread in the ITPilot wrapper. It is typically used when the processing of each of the records obtained in an extraction operation is to be carried out concurrently.
• Functions:
  o wait(): puts the thread on standby until all the executions launched with the execute function have finished.
  o execute(functionName, <list of arguments>): launches the execution of the given function in a thread.
    • functionName: name of the JavaScript function to be run.
    • <list of arguments>: comma-separated list of arguments, which must match the arguments of the JavaScript function.
  o setMaxConcurrentThreads(int): configures the maximum number of Thread instances that are used in parallel. Later requests are queued until the ongoing executions finish.
    • int: maximum number of concurrent threads.
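As an illustration of the concurrent pattern described above, the following sketch launches one execution per item and then waits for all of them. The helper processConcurrently is hypothetical. It takes an already created Thread object as a parameter, because this section does not show how the object is obtained, and it assumes that the function to run is identified by its name.

```javascript
// Hypothetical helper: launches 'functionName' once per item and waits for all runs.
// 'thread' is expected to be an ITPilot Thread object as described above.
function processConcurrently(thread, functionName, items, maxThreads) {
    var i;
    // Limit parallelism; further requests are queued by the component.
    thread.setMaxConcurrentThreads(maxThreads);
    for (i = 0; i < items.length; i++) {
        // Assumption: the function is passed by name, followed by its arguments.
        thread.execute(functionName, items[i]);
    }
    // Stand by until every execution launched above has finished.
    thread.wait();
    // Number of executions that were launched.
    return items.length;
}
```

Calling wait() after the loop, rather than after each execute(), is what lets the launched executions overlap.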

5.4 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

5.4.1 Developing Custom Components

Custom components can be developed graphically with the wrapper generation tool (see [GENER]), but they can also be developed in JavaScript. To do so, create a file with the .js suffix in the path <DENODO_HOME>/metadata/itp-custom-components containing the following functions:

• mycustom_main(mycustom_input) { var mycustom_output = null; ... return mycustom_output; }
  o This is the main function, where "mycustom" is the name of the custom component.
• mycustom_getInputStructure() { ... }
  o This function defines the input schema.
• mycustom_getOutputType() { return <TYPE>; }
  o This function defines the output type of the component. The possible values are:
    • LIST_TYPE = 1
    • PAGE_TYPE = 2
    • RECORD_TYPE = 3
    • SIMPLE_TYPE = 4
    • ARRAY_TYPE = 5
    • BINARY_TYPE = 6
    • BOOLEAN_TYPE = 7
    • DATE_TYPE = 8
    • DOUBLE_TYPE = 9
    • FLOAT_TYPE = 10
    • INT_TYPE = 11
    • LONG_TYPE = 12
    • STRING_TYPE = 13
    • URL_TYPE = 14
    • BROWSER_ID_TYPE = 15
• mycustom_getOutputStructure() { ... }
  o This function defines the output structure returned by the component. It is needed only when the output type returned by mycustom_getOutputType is RECORD_TYPE or LIST_TYPE.
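The naming scheme above can be sketched for a hypothetical component called "upper" (which would live in a file named upper.js). Several points are assumptions made for illustration: the input is treated as a plain string value, the type codes are declared locally because this section does not say whether ITPilot predefines them, and the input-structure function is left out because its body is not shown here. The helper needsOutputStructure is also hypothetical and only restates the rule for RECORD_TYPE and LIST_TYPE.

```javascript
// Output type codes copied from the list above (declared locally as an assumption).
var LIST_TYPE = 1;
var RECORD_TYPE = 3;
var STRING_TYPE = 13;

// Main function of the hypothetical "upper" component.
// Assumption: the input arrives as a plain string value.
function upper_main(upper_input) {
    var upper_output = null;
    if (upper_input !== null && upper_input !== undefined) {
        upper_output = String(upper_input).toUpperCase();
    }
    return upper_output;
}

// Declares the output type of the component: a string.
function upper_getOutputType() {
    return STRING_TYPE;
}

// Hypothetical helper: an output structure function is needed only for
// record and list outputs, so it is not needed for this component.
function needsOutputStructure(outputType) {
    return outputType === RECORD_TYPE || outputType === LIST_TYPE;
}
```

Because the output type is STRING_TYPE, the sketch does not define upper_getOutputStructure.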

5.4.2 Using Custom Components

To use a custom component developed in JavaScript, store it in JavaScript format (with the .js extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by a .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as mentioned in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following code:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:
• <customcomponent_type> is the type of the custom component to be used.
• <component_name> is the name of the component.
• <input_parameters> is the list of input parameters that the custom component receives.

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating the wrapper only requires writing the following VQL statement:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code that has just been developed.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.
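The escaping rule in the note can be sketched as a small helper that builds the statement from a script. The function buildCreateWrapper is hypothetical. It assumes that the backslash is the escape character and takes the delimiting quote character as a parameter, since the exact delimiter should be checked against the VQL guide (see [VQL]). The optional MAINTENANCE clause is left out for brevity.

```javascript
// Hypothetical helper: builds the CREATE WRAPPER statement for a script.
// Assumptions: 'quoteChar' delimits the code and a backslash escapes it.
function buildCreateWrapper(name, jsCode, quoteChar) {
    // Prefix every occurrence of the delimiter inside the code with a backslash.
    var escaped = jsCode.split(quoteChar).join("\\" + quoteChar);
    return "CREATE WRAPPER ITP " + name + " " + quoteChar + escaped + quoteChar;
}
```

Only the delimiter is escaped; whether other characters need escaping is not covered by this section.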

REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis/

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262: ECMAScript Language Specification 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045: Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap/

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl

Page 65: Denodo ITPilot 4.6 Developer Guidehelp.denodo.com/.../4.6/DenodoITPilot.Developer.pdf · 2017-10-25 · ITPilot 4.6 Developer Guide Deploying and Invoking ITPilot Wrapper Access Web

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 60

5329 Thread

bull Object Thread

bull Description this represents a Thread in the ITPilot wrapper It is often used when the subsequent processing on each of the records obtained in an extraction operation is carried out concurrently

bull Functions

o wait() This causes the thread to enter standby until all executions invoked with the function execute have been finished

o execute(functionName ltlist of argumentsgt) this launches the run thread on the described function

bull functionName name of the JavaScript function to be run

bull ltlist of argumentsgt list of arguments separated by commas which must match the arguments of the JavaScript function

o setMaxConcurrentThreads(int) allows to configure the maximum number of Thread instances that will be used in parallel Later requests will be queued until the ongoing executions finish

bull int maximum number

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 61

54 USE OF CUSTOM COMPONENTS IN JAVASCRIPT WRAPPERS

541 Developing Custom Components

Custom components can be graphically developed by using the wrapper generation tool (see [GENER]) but they can also be developed in JavaScript To achieve it a file with js suffix must be created and stored in the path ltDENODO_HOMEgtmetadataitp-custom-components with the following functions

bull mycustom_main(mycustom_input) var mycustom_output = null hellip return mycustom_output

o This is the main function where ldquo mycustomrdquo is the name of the custom component

bull mycustom_getInputStructure() hellip

o This function allows to define the input schema

bull mycustom_getOutputType() return ltTYPEgt

o This is the function that defines the component output type The possible values are

bull LIST_TYPE = 1

bull PAGE_TYPE = 2

bull RECORD_TYPE = 3

bull SIMPLE_TYPE = 4

bull ARRAY_TYPE = 5

bull BINARY_TYPE = 6

bull BOOLEAN_TYPE = 7

bull DATE_TYPE = 8

bull DOUBLE_TYPE = 9

bull FLOAT_TYPE = 10

bull INT_TYPE = 11

bull LONG_TYPE = 12

bull STRING_TYPE = 13

bull URL_TYPE = 14

bull BROWSER_ID_TYPE = 15

bull mycustom_getOutputStructure) hellip

ITPilot 46 Developer Guide

Developing ITPilot Wrappers with JavaScript 62

o This function is responsible for defining the output structure that will be returned by the component It is necessary only when the output type defined by the function myCustom_getOutputType is of type RECORD_TYPE or LIST_TYPE

5.4.2 Using Custom Components

A custom component developed in JavaScript must be stored as a JavaScript file (with the ".js" extension) in the <DENODO_HOME>/metadata/itp-custom-components directory. Each component is represented by one .js file whose name matches the name of the custom component. The main function of the custom component is <component>_main(Inputelement), where <component> is the name of the custom component, as described in the previous section. To use a custom component from a wrapper developed in JavaScript, use the following code:

    try {
        SCOPE.create();
        mycustom = new CUSTOM_COMPONENT(<customcomponent_type>);
        mycustom.setComponentName(<component_name>);
        mycustom_output = mycustom.exec(<input_parameters>);
    } finally {
        SCOPE.close();
    }

Figure 8 Using custom components from JavaScript

where:

• <customcomponent_type> is the type of the custom component to be used.

• <component_name> is the name of the component.

• <input_parameters> is the list of input parameters that the custom component receives.
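The reason for the try/finally in Figure 8 is that the scope is closed even when the component fails. The following sketch shows this outside ITPilot. It is only an illustration under stated assumptions: SCOPE and CUSTOM_COMPONENT are replaced by simple stand-ins that record the calls made to them and do not reproduce the behaviour of the real objects, and runCustomComponent is a hypothetical helper name.

```javascript
// Stand-ins so the pattern can run outside ITPilot. In a real wrapper, SCOPE
// and CUSTOM_COMPONENT are supplied by the ITPilot runtime; these mocks only
// record calls and do not reproduce the real behaviour.
var calls = [];
var SCOPE = {
    create: function () { calls.push("create"); },
    close: function () { calls.push("close"); }
};

function CUSTOM_COMPONENT(type) {
    this.type = type;
    this.name = null;
}
CUSTOM_COMPONENT.prototype.setComponentName = function (name) {
    this.name = name;
};
CUSTOM_COMPONENT.prototype.exec = function (input) {
    calls.push("exec:" + this.name);
    if (input === null) {
        throw new Error("no input");
    }
    return String(input).toUpperCase();
};

// Hypothetical helper wrapping the try/finally pattern of Figure 8.
function runCustomComponent(type, name, input) {
    try {
        SCOPE.create();
        var component = new CUSTOM_COMPONENT(type);
        component.setComponentName(name);
        return component.exec(input);
    } finally {
        // Runs on both normal return and exception.
        SCOPE.close();
    }
}
```

Calling runCustomComponent with a null input makes the stand-in exec throw, yet "close" is still recorded after "create", which is the guarantee the pattern is meant to provide.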

5.5 WRAPPER DEVELOPMENT

Once the script has been developed, creating a wrapper only requires writing a VQL statement of the following form:

    CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode

where jscode is the JavaScript code of the script.

NOTE: The VQL syntax uses quotes to delimit the JavaScript code, so any quotes used inside the code must be escaped with the ‘\’ character.
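The escaping described in the note can be sketched as a small helper that prepares a script before it is embedded in the statement. This is a sketch under several assumptions: the escape character is taken to be a backslash, only the delimiting quote itself is escaped, and the quote character is passed in explicitly because this text does not show which one VQL expects (see [VQL]). Both function names are hypothetical.

```javascript
// Hypothetical helper: escapes every occurrence of the delimiting quote
// character inside the wrapper script. Assumptions: the escape character is a
// backslash, and only the delimiter itself needs escaping.
function escapeForVql(jscode, quoteChar) {
    return jscode.split(quoteChar).join("\\" + quoteChar);
}

// Hypothetical helper: builds the statement text following the syntax
// CREATE WRAPPER ITP <name> [MAINTENANCE FALSE] jscode.
function buildCreateWrapper(name, jscode, quoteChar, maintenanceFalse) {
    var statement = "CREATE WRAPPER ITP " + name + " ";
    if (maintenanceFalse) {
        statement += "MAINTENANCE FALSE ";
    }
    return statement + quoteChar + escapeForVql(jscode, quoteChar) + quoteChar;
}
```

For example, buildCreateWrapper("mywrapper", script, '"', false) returns the statement with every double quote inside script preceded by a backslash.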


REFERENCES

[AXIS] Apache Axis Web Server. http://ws.apache.org/axis

[DATEFORMAT] Java Format Representation for dates. http://java.sun.com/j2se/1.5.0/docs/api/java/text/SimpleDateFormat.html

[DEXTL] Denodo DEXTL 4.6 Manual. Denodo Technologies, 2011.

[DOTNET] Microsoft .NET Framework. http://www.microsoft.com/net

[DPORT] Denodo Virtual DataPort 4.6 Administration Guide. Denodo Technologies, 2011.

[ECMA262] Standard ECMA-262. ECMAScript Language Specification, 3.0.

[GENER] Denodo ITPilot 4.6 Generation Environment Guide. Denodo Technologies, 2011.

[JDOC] Javadoc documentation of the Developer API.

[MIME] RFC 2045. Multipurpose Internet Mail Extensions (MIME).

[NSEQL] Denodo ITPilot 4.6 NSEQL Manual (Navigation SEQuence Language). Denodo Technologies, 2011.

[PERL] PERL Language. http://www.perl.com

[USER] Denodo ITPilot 4.6 User Guide. Denodo Technologies, 2011.

[SOAP] SOAP Version 1.2. W3C Recommendation. http://www.w3.org/TR/soap

[VQL] Denodo Virtual DataPort 4.6 Advanced VQL Guide. Denodo Technologies, 2011.

[WSDL] Web Services Description Language (WSDL) 1.1. W3C Note. http://www.w3.org/TR/wsdl
