
Discovering Trusted Trade Lanes using Linked Data

Master Customs & Supply Chain Compliance

Delft University of Technology
Faculty: Technology, Policy and Management (TPM)

Master Thesis by

Rolf Nijenhuis

Final version, February 2015

Graduation committee:

Yao-Hua Tan (graduation professor)
Joris Hulstijn (first supervisor)
Wout J. Hofman (second supervisor)
Wim Visscher (external supervisor)


Summary

Abstract

We may look at global supply chains as a black box: they are operated by many parties in different settings, each with its own objectives, and the parties are not aware of the supply chain as a whole. Customs authorities have difficulties fulfilling their risk management duties on safety and security for shipments entering the EU, while they should cause minimal disturbance to the business flows. The EU Customs Union has expressed that it lacks information on the parties behind the transactions. It is about to change its legislation (UCC) and introduce multiple filing, which will allow different stakeholders to add data to the Entry Summary Declarations (ENS). We investigate this proposed change and question whether it will result in the required data on parties.

For the better organization of supply chains, also concerning governmental cross-border control, their visibility and interoperability need to improve. The parties, events, costs and assurances need to be known in order to improve and optimize, and the parties should be able to work together efficiently. Visibility is enhanced by sharing data and information. Interoperability improves when the IT systems of parties connect according to agreed standards, using shared data. These aspects have been investigated in EU projects over the past decade, and research continues in the CORE project. We explore the main findings and concepts, such as the Information Infrastructure, the Data Pipeline, the move from Data Push towards Data Pull, and Piggy Backing. The tendency in these research projects is that cross-border inspection authorities would like to capture data on supply chains from the IT systems of the stakeholders involved instead of waiting for declarations.

The technical developments of the Semantic Web and Linked Data could contribute to improving visibility. In this paper we explore what these are, and we observe that Linked Data is more than linking data. Systems using linked data must adhere to the Open World Assumption (OWA) and apply monotonic logic; decision-making processes should be aware that the available data may not be complete. We found that traditionally closed (world) systems could change and apply open world reasoning, considering the process of risk assessments on goods entering the EU. We also found that Linked Data could serve to implement the Data Pipeline concept.

The Enforcement Vision of Dutch customs has provided the concepts of Optional Multiple Filing and the Trusted Trade Lane (TTL). We found that these relate to open world logic and that they may be realized using Linked Data.

This paper presents exploratory research on how customs may benefit from linked data techniques for risk assessments on maritime shipments entering the EU, with extra information on the shipments presented on a Customs Dashboard. It focuses on the need to know the parties in the trade lanes and on the ambition to discover TTLs. Knowledge of such trade lanes would ease the risk assessments and speed up trade.

We propose the Guided Open World Assumption (GOWA) as a means of sharing data in the supply chain using a combination of existing closed world methods and the open world of data on the web. It consists mainly of using IDs and seeds of data in the ENS, like the container number, to find linked data. Within this limited scope, the data can be assumed to be complete.

Linked Data is only possible when some party offers its data for sharing. The assumption that parties would want to share data with customs also underlies the concept of Optional Multiple Filing. In our research design we used this method as an application scenario for linked data.

Our abstract design is based on a case study of the deep-sea trade lane of roses from Kenya to Holland, as researched in CORE. We used its investigation of stakeholders as a proposition to develop the domain knowledge and the ability to define degrees of TTLs. In effect we believe that the colour yellow for a TTL, representing a known trade lane of trusted traders, will occur in many different shades, depending on the level and amount of acquired data. For this we propose applying Data Analytics techniques to linked data to automate the development of insights.

Our research shows the possibilities that linked data offer in creating more visibility for Customs Risk Management. This requires new mindsets and new technologies, but our basic research design shows that it may grow from a base level.

This paper recommends customs authorities as the appropriate stakeholder to initiate Linked Data developments and to start working on the shift towards open world systems.

Graduation committee:
Yao-Hua Tan (graduation professor), Professor, Delft University of Technology
Joris Hulstijn (first supervisor), Assistant Professor, Delft University of Technology
Wout J. Hofman (second supervisor), Senior Research Scientist, TNO Delft
Wim Visscher (external supervisor), Strategic Intelligence, Dutch Customs Authorities

Keywords: data pipeline, Dutch Enforcement Vision, global supply chain, linked data, ontology, open world, trusted trade lane, interoperability.

References: Notations like (Heath and Bizer 2011) or Heath and Bizer (2011) are references, explained in chapter . Web links are presented as (link; 2015) and are explained in Annex 7. Interviewed people are referred to by name and are listed in Annex 3.


Contents

Summary
1. Introduction
1.1. A customs problem
1.2. Research setting
1.3. Linked data
1.4. Research scope and design
1.5. Research methods
1.6. Research questions
2. Data quality ENS and multiple filing
2.1. Stakeholders
2.2. The UCC and multiple filing
2.3. The Dutch enforcement vision
2.4. Trusted Trade Lanes (yellow lanes)
2.5. Intermediate conclusion
3. Trying to open the supply chain black box
3.1. Black box supply chains
3.2. Interoperability and Information Infrastructure
3.3. Data pipeline
3.4. Innovative concepts
3.5. Concepts in architecture
3.6. Research on Linked Data
3.7. Intermediate conclusion
4. Web of Data
4.1. Web of data for interoperability
4.2. The foundations and concepts
4.3. Resource Description Framework (RDF)
4.4. Domain modelling
4.5. SPARQL
4.6. Initiatives in practice
4.7. Adapting to the Open World
4.8. Intermediate conclusion
5. Design proposal and case study
5.1 The Guided Open World Assumption
5.2 Design proposal for discovering Trusted Trade Lanes
5.3 Case study fresh roses
5.4 Intermediate conclusion
6. Evaluation
7. Conclusion
7.1. Societal and scientific relevance
7.2. Limitations
7.3. Further research
7.4. Reflection
7.5. Acknowledgments
8. Annexes
8.1 Annex 1 Ontology example
8.2 Annex 2 Ontology optional filing
8.3 Annex 3 Interviewed
8.4 Annex 4 Abbreviations
8.5 Annex 5 Data elements
8.6 Annex 6 Questionnaire
8.7 Annex 7 Web links
9. References


1. Introduction

Global supply chains are of great importance for our economies and wellbeing. They have expanded enormously over the last decades and will expand further. Cross border authorities and commercial stakeholders have a shared interest in interoperability of the parties involved and in visibility of what happens when, where and by whom in the chain of activities (Hesketh 2010). The general setting for this research is to improve this visibility.

We may simply describe visibility as the degree to which information on the processes is shared with stakeholders, and interoperability as the ability of parties to cooperate. In general we may agree that, for the benefit of the whole chain, parties want access to more data so that they can make better decisions, faster.

Authorities like customs will be able to facilitate trade lanes appropriately if they have information on the reliability of the traders and goods involved. The data available for Customs Risk Management processes on goods entering the EU, as provided by the ENSs, have proven insufficient for this purpose (Commission 2012).

IT innovations are required to realize improvements. These have been, and continue to be, researched in EU projects over the last decade. Our main research question is:

How can Linked Data technology contribute to the improvement of the visibility of supply chain stakeholders for customs risk assessments on goods entering the EU?

Such visibility is evident in the open concept of the Trusted Trade Lane as coined by Dutch customs (Frank Heijmann, dutchvision; 2015). We will use this concept as the direction for our research design.

1.1. A customs problem

Customs authorities are facing a huge challenge. They should not disrupt the (growth of) global trade, yet must have proper risk management in place to ensure safety and security on the enormous quantities of incoming goods. They find themselves in a dilemma between facilitating trade and controlling it by checks. The 9/11 events have clearly had great impact, and the threat of terrorist attacks is still very real (isis; 2015). But the penetration of counterfeit goods and the smuggling associated with containerized deliveries, especially of narcotics, must also be guarded against. Customs are not the only border protection authority; they collaborate with other agencies like tax, veterinary inspection, plant protection, border police, harbour authorities, the immigration office and many more. In this thesis we will concentrate on customs, and on maritime container imports.

EU customs should have insight into which parties are and have been involved in producing, trading, packing and transporting the entering goods, but also into the backgrounds, production and origin of the goods themselves and the routes and means of transport used. This information is dispersed throughout the complete supply chain that takes care of the goods flow into the EU, and it is not an easy task to get hold of these data (Hesketh 2010). In this sense we can regard the global supply chain as a black box (Jensen, Bjørn-Andersen et al. 2014).

Customs authorities carry out risk assessments based on the data of ENSs, which have to be lodged prior to loading. The procedure of pre-load declarations was installed rather hastily in 2011 as a reaction to the 9/11 terrorist attacks (Roel van 't Veld), following the recommendations of the WCO in its 'Safe framework of standards' of 2007. Experience has shown that the quality of the ENS data in general does not meet the required level (Commission 2012). The sea carrier is responsible for filing the declaration. At first sight it seems the most logical decision to address this requirement to the carrier, as the party bringing the goods across the EU border. But it is questionable whether the carrier is the right party to ask for detailed information on commercial parties, transactions and the goods carried. The carrier has no interest in knowing these details: in case of an insurance claim, the carrier's liability is limited to the number of packages, and this may change if the carrier is aware of the value of the goods (rdamrules; 2015).

It is clear already that the European Commission is of the opinion that parties other than the carrier are better informed and should be able to provide data. The new EU customs law, the UCC (UCC; 2015), allows other parties to add to the filings, while keeping the carrier responsible for the overall ENS declaration. This option is sometimes referred to as dual or multiple filing, next to the filing by the carrier. The Implementing and Delegated Acts on the UCC, which are under development now, will elaborate on this option; they must be implemented in 2016 (Commission 2014, Commission 2014-2). For this research we explored the second drafts of these acts, which were made public in March 2015.

Dutch customs have set up a similar process which they call optional (multiple) filing; we will further refer to it as optional filing. This filing is based on the free will of stakeholders, is motivated by the facilitation of trade lanes, and is not an 'official' declaration (see also paragraph 2.3).

1.2. Research setting

Much research on improving global supply chains has been performed within the EU. The case study we use in this research is based on participation in the still running CORE project, which forms part of the EU's Seventh Framework Programme for Research (FP7; 2015). It was preceded by the ITAIDE and CASSANDRA projects, among others. These projects unite the different parties concerned with global supply chains, like authorities, commercial parties, carriers, software businesses and business communities, but also researchers. Interoperability in global supply chains is a challenge, considering that they connect parties with different languages, motives, cultures and expectations (Klievink and Lucassen 2013).

Many intended improvements are IT related and linked to IT innovations. In this context the concept of the data pipeline emerged (Klievink, Van Stijn et al. 2012). It was first defined by David Hesketh (UK Customs) and Frank Heijmann (DCA). By creating a data pipeline which gives access to all data involved in the supply chain, businesses will be able to improve the chain: undue costs can be located and unnecessary delays will become less frequent. And authorities like customs will be better informed if they can use this pipeline to consult data. The claim is that the quality of data for risk assessments will improve, and that customs will be able to check for themselves who did what, and when (Hesketh 2010).

The data pipeline concept is very challenging to realise and would require a public-private governance (PPG) model to guide it and standards in data definitions and IT-systems (Klievink, Van Stijn et al. 2012).

A general model to which trade lanes might subscribe has proven difficult to design, although some business-to-business connections have been realized (B. Klievink; E. Geerts). But in trying to get to designs and implementations, the concept has pushed researchers towards new ways of thinking, and some new concepts aimed at the usage of the data by authorities have evolved from it. Examples are 'piggy backing' (authorities reuse business data or data flows for supervision and control purposes), 'system based audit' (authorities use electronic business data and systems of controls to check and audit business administrations) and the 'message paradigm' versus the 'resource paradigm' (instead of receiving data in messages, authorities incline to obtain data from the source IT systems), first discussed by Tan, Bjørn-Andersen et al. (2010). These relate to more technical concepts like data pull (the receiver takes action) and data push (the sender takes action).

Some research has already suggested the usage of semantic web and linked data for data pull as well (Hofman 2011-2; Karakostas 2014).
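To make this distinction concrete, the minimal sketch below contrasts the two interaction styles. The endpoint URLs, payload and field names are hypothetical and are not taken from any of the cited projects.

```python
import json
import urllib.request

def push_declaration(declaration: dict) -> None:
    # Data push / message paradigm: the sender (the business) takes action
    # and lodges a declaration message with the authority.
    req = urllib.request.Request(
        "https://customs.example.org/ens",  # hypothetical endpoint
        data=json.dumps(declaration).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req)

def pull_consignment(container_number: str) -> dict:
    # Data pull / resource paradigm: the receiver (customs) takes action
    # and dereferences a resource in the business's own IT system.
    url = f"https://carrier.example.com/consignments/{container_number}"
    with urllib.request.urlopen(url) as response:  # hypothetical resource
        return json.loads(response.read().decode("utf-8"))
```

In the push variant the business decides when customs receives data; in the pull variant customs decides when to consult the business's resources, which is the direction the resource paradigm points to.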


1.3. Linked data

The world wide web as we know it now has created a revolution in the way that societies, institutions, businesses and, most importantly, individual persons operate and behave. The world has been changed by it. The Semantic Web, also referred to as the Web of Data and Web 3.0, is taking the web a huge step further. The web so far has made it possible to share knowledge by sharing and linking documents and web sites. The web of data adds new dimensions to sharing knowledge. Firstly, it can link the individual data elements held within documents and sites, which widens the sharing of specific information enormously. Secondly, it can register the semantics of these data in an open and extensible way, available for human checks but also for automated processes. It becomes possible to link data and build knowledge based not just on common search criteria in documents but on the data behind the documents and their meaning. The web of data is turning the web from a dumb web into a smart web, in the sense that it partly takes over human logical thinking (Allemang and Hendler 2011). We may already experience this change when using the Google search engine: it produces links that might not even contain the words used in the search argument but are somehow related to them, sometimes in a way one would not have thought of oneself. This is partly because Google uses semantic web technologies and ontologies (wiki-kg; 2015).

The vision of the web of data was first described by Tim Berners-Lee in the early 1990s, and he has continued to shape it (TBL; 2015). The coordination and development of concepts and models are guided by the World Wide Web Consortium (W3C; 2015). This means that the concepts and technologies of the web of data are open standards. The exciting idea behind the web of data is that the huge amounts of data that we are producing can be linked to each other and produce information, and even knowledge, that we would not find without it.

For this we need the data to be set up in such a manner that they can easily be linked together. The model for this is provided by the Resource Description Framework (RDF). The meaning of these data, the semantics, is itself also provided as data in RDF and expressed by making use of the Web Ontology Language (OWL). An ontology describes the concepts of a specific domain and their relationships, such as the domain of customs.
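As a minimal illustration of these building blocks, the sketch below uses the Python rdflib library to state a few triples about a fictitious shipment against a tiny customs-style vocabulary. The namespace, class and property names are invented for this example and are not part of any official customs ontology.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

# Hypothetical vocabulary for a small customs domain (illustrative only).
CUS = Namespace("http://example.org/customs#")

g = Graph()
g.bind("cus", CUS)

# Ontology part: the concepts of the domain and their relationships.
g.add((CUS.Shipment, RDF.type, RDFS.Class))
g.add((CUS.Party, RDF.type, RDFS.Class))
g.add((CUS.consignor, RDFS.domain, CUS.Shipment))
g.add((CUS.consignor, RDFS.range, CUS.Party))

# Data part: RDF triples describing one fictitious shipment.
g.add((CUS.shipment42, RDF.type, CUS.Shipment))
g.add((CUS.shipment42, CUS.containerNumber, Literal("MSKU1234567")))
g.add((CUS.shipment42, CUS.consignor, CUS.acmeFlowersLtd))
g.add((CUS.acmeFlowersLtd, RDFS.label, Literal("Acme Flowers Ltd")))

print(g.serialize(format="turtle"))
```

Because both the vocabulary and the data are plain triples, other parties can publish additional triples about the same resources and link to them.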

Note that RDF databases (RDF data stores) for the most part do not contain the actual data but only the web links (URLs) pointing at where the real data are held. Their power lies in connecting and linking data, hence the name linked data. The essence is that data are shared and may be linked to by other parties via the web, who can directly understand and interpret the meaning of these data, humans as well as tools, by using formal logic. In this paper we will treat Linked Data and the Semantic Web as identical, to be clear in terminology and to indicate that, in our opinion, the one cannot exist without the other.

A crucial concept behind the semantic web is the Open World Assumption (OWA), which is opposed to the Closed World Assumption (CWA). The CWA is applied in traditional 'database' IT systems and presumes that all required data are present; therefore, if some fact is not known, we may assume its negation to be true. The OWA assumes incomplete information by default (manch; 2015): if some fact is not known, we may not assume its negation to be true. The adoption of linked data technology will require a transition from closed to open world practices, which implies a paradigm shift. We will propose the Guided OWA as a means to initiate this in a controlled manner.
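The difference between the two assumptions can be made concrete with a small sketch: under the CWA, an absent fact counts as false; under the OWA, it is merely unknown. The data and the 'certified' property below are invented for illustration.

```python
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/customs#")

g = Graph()
g.parse(data="""
    @prefix ex: <http://example.org/customs#> .
    ex:traderA ex:certified "AEO" .
""", format="turtle")

def has_certification(trader):
    # True only if a matching triple exists in the local data store.
    return (trader, EX.certified, None) in g

# Closed World Assumption: absence of the fact is taken as its negation.
cwa_view = "certified" if has_certification(EX.traderB) else "not certified"

# Open World Assumption: absence of the fact only means it is unknown;
# more (linked) data about traderB may still exist elsewhere on the web.
owa_view = "certified" if has_certification(EX.traderB) else "unknown"

print(cwa_view)  # -> not certified
print(owa_view)  # -> unknown
```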

The OWA is a classic assumption in the academic field of Artificial Intelligence (Russell and Norvig 1995). It fits the view that the data on the web can always grow and form an organic web, as Allemang and Hendler (2011) state: 'An information web is an organic entity that grows from the interests and energy of the communities that support it'.


1.4. Research scope and design

This research spans the large, complex area of global supply chains, customs risk management and linked data, and is guided by the research performed under EU projects. We investigate the IT innovations with specific attention to the application of Linked Data. We further focus on improving the quality of electronic data for customs risk assessments on maritime container transports entering the EU, regarding the visibility of stakeholders. In our conclusion we will evaluate our findings on using linked data at the broader level of supply chains in general and of the communication between EU Member States.

The figure below gives an overview of the research design. The research is characterized as exploratory design science of a sense-making type. The requirements are viewed from the perspective of EU customs. They are rather abstract, both because of the sense-making intentions of this research and for reasons of customs security and privacy. In this research we focus on creating visibility on stakeholders in the supply chains.

Figure 1 Research overview (according to Hevner (2007))

The design is based on a specific case study, namely the import of roses from Kenya, and has been checked and evaluated by means of a questionnaire.


1.5. Research methods

The research has been performed as a combination of exploratory qualitative research and design science (Hevner et al. 2004). Desk and literature research have formed important parts, in the areas of the EU projects and Linked Data. The CORE project provided me with current information and documentation, and interviews provided background information. CORE makes use of the living lab method (Tan, Bjørn-Andersen et al. 2010), a kind of action research for testing out innovative ideas.

Work package 11 of the CORE project focuses on maritime trade lanes into the port of Rotterdam. The trade lane of fresh roses from Kenya to the flower auction FloraHolland is also part of its research and development. Because of my involvement and participation in this I was able to gather data and information on the trade lane and the progress made, up to June 2015. This provided me with input to analyse the proceedings and to design an overall approach making use of the principles of Linked Data. My involvement also gave me access to research results of CASSANDRA and intermediate reports within CORE. I have used this material carefully and referenced it as confidential where this was the status of the documentation. Living labs are defined as 'a gathering of public-private partnerships in which businesses, researchers, authorities, and citizens work together for the creation, validation, and test of new services, business ideas and technologies in real-life contexts' (Klievink and Lucassen 2013).

This resulted in a design proposal for a practical, current problem concerning customs risk assessments, representing the design research part.

The research design, resulting in a research framework (figure 2) with research questions, was carried out according to the practice-oriented research method described in Verschuren, Doorewaard et al. (2010).

The results of the questionnaire sent to stakeholders in the CORE project are used in the final chapters on evaluation and conclusions.


1.6. Research questions

Our main research question is: How can Linked Data technology contribute to improving the visibility of supply chain stakeholders for customs risk assessments on goods entering the EU?

We will substantiate the answer by addressing the following sub questions.

Sub question 1
We will use Optional Filing as the application scenario for Linked Data in our research design. For this we will address the following sub question: What does 'multiple filing' mean, as proposed by the European Commission, and what are its (dis)advantages compared to 'optional filing', as proposed by Dutch Customs instead?

Sub question 2
What are the state-of-the-art findings in improving interoperability and visibility of global supply chains by IT and Linked Data?

Figure 2 Research Framework

Sub question 3
What are the characteristics of Linked Data that could improve mutual interoperability of businesses and cross border authorities like customs?

Sub question 4
What design could serve an implementation of Linked Data technology for EU-customs risk assessments, and based on which requirements?

Note that the case study is not presented in the analysis chapter, as might be expected, but in the design chapter. We chose to do this for reasons of design overview and editorial structure.


2. Data quality ENS and multiple filing

In 2012 the European Commission formulated directions for a strategic approach to improve customs risk management on goods entering the EU (Commission 2012). It addressed the inadequate data quality of Entry Summary Declarations (ENSs) and stated the problem as follows:

To make full use of risk assessment methods, one needs to know 'who is moving what, to whom, from where'. Data on the real parties behind the transaction and the movement of goods (buyer and seller or owner), and on the precise goods involved, is essential, as is information on the routing of the goods throughout the supply chain. (…) Relevant, high-quality data is fundamental to electronic data-based risk management, but the study suggests that current input does not meet minimum requirements. Moreover, there is a systemic gap in the provision of information on the parties behind the transaction and some other data elements provided are of low quality.

Thus, the current input of data does not meet minimum requirements. The quality of the data received via the ENS is characterized by the elements of timeliness, preciseness and relevance (Commission 2012).

In a recent study we could confirm that the ENS does not deliver quality data. By comparing data on the same shipments to the port of Rotterdam, first declared in the ENS (pre-load) and later declared in the SATO (for temporary storage), we found considerable differences.

[Figure: bar chart comparing, per data element (MRN, containers, weight, number of packages, port of loading, country of consignor, country of consignee), the percentage of shipments for which the ENS and SATO declarations match (ENS=SATO) or differ (ENS<>SATO), on a scale of 0% to 100%.]

Figure 3 Differences in ENS versus SATO declarations for the same shipments (Ozturk, Nijenhuis et al. 2014)

We detected not only significant deviations in weight and number of packages, but even in containers and in the countries of loading, of consignors and of consignees. No further research into the causes of these differences was performed, but one expected reason is that the ENS has to be lodged at a much earlier moment in time, after which changes still occur.
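A minimal sketch of the kind of field-by-field comparison underlying Figure 3 is given below. The record structure and values are invented for illustration; the actual study compared real ENS and SATO declarations.

```python
# Illustrative comparison of ENS and SATO declarations for the same shipments.
# Field names and records are hypothetical, not the actual study data.
FIELDS = ["containers", "weight", "packages",
          "port_of_loading", "country_consignor", "country_consignee"]

def match_rates(pairs):
    """Per data element, the share of shipments where ENS equals SATO."""
    return {
        field: sum(1 for ens, sato in pairs if ens[field] == sato[field])
               / len(pairs)
        for field in FIELDS
    }

ens = {"containers": "MSKU1234567", "weight": 12000, "packages": 480,
       "port_of_loading": "KEMBA", "country_consignor": "KE",
       "country_consignee": "NL"}
sato = dict(ens, weight=11850, packages=500)  # the later declaration deviates

print(match_rates([(ens, sato)]))
# weight and packages differ; the other elements match
```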

For this research we will focus on what the commission described as 'a systemic gap in the provision of information on the parties behind the transaction'. Customs want to know which parties were involved; the data quality aspect of relevance, as referred to by the European Commission, relates to this issue. We will address the stakeholders next.


2.1. Stakeholders

Many different stakeholders form the supply chains, and customs are very interested in knowing them. To provide an overview we made a short analysis. The international supply chains involving The Netherlands were researched in 2013 (Veenstra, de Putter et al. 2013). One conclusion was that the many parties involved in the supply chain of containerized goods can be organized into three categories:

a. Parties who own the goods at some moment in time
b. Parties who take responsibility for organizing the transport of the goods
c. Parties who actually have the goods at their disposal for some time

Supply chains may take countless forms, and stakeholders may combine different roles from the above categories; note that the cross-border authorities are not covered by these categories. The figure below presents a simplified model of the stakeholders involved in a global supply chain using maritime containerized transport. It shows the different levels and interests, indicated as stakeholder layers. These layers correspond to the categories above (owning, organizing transport, having the goods at one's disposal) and additionally include the inspection authorities.

Note that this is just an overall schema covering only the deep-sea transport leg. The number of stakeholders can be far higher, and a single stakeholder may play several roles: the freight forwarder can also be the consignor and/or the agent, or even the carrier. And the parties involved in trade lanes change dynamically as well.

Figure 4 Stakeholders and layers in the flow of global containerized maritime trade

We can see several levels of communication lines. The buyer and seller set up the commercial contract. The consignor and consignee agree on content, transport and delivery. The freight forwarders have technically oriented logistics contacts and contracts. The carriers' agents are concerned with the communication between freight forwarder and carrier and with the agents at the other (importing) side; they most often also communicate with governmental and port authorities. The ocean carrier is concerned with the transport of containers and in general only deals with the larger freight forwarders. The number of carriers and larger freight forwarders is limited compared to the number of stakeholders concerned with the commercial transactions. The commercial parties, who are more interested in the goods than in their transport, in the foreland as well as in the hinterland, are mostly not known to the carrier.

The stakeholders dealing with the transport, as depicted by the red lines, are evidently of less relevance for risk assessments on safety and security. The EU seeks to know the other stakeholders as well (Hesketh 2010) and aims to be provided with the names and addresses of seller and buyer.

We may conclude that global supply chains are executed by a complex mixture of stakeholders and may adopt a variety of changing forms. The EU is trying to gain more insight into their compositions and collaborations.

2.2. The UCC and multiple filing

The Union Customs Code (UCC) comprises the new legislation for EU customs. It was published in October 2013 (Regulation 952/2013 of the European Parliament and the Council) and came into force on 30 October 2013. For the lodging of the ENS it holds the sea carrier liable, as was already the case in the previous legislation. This is the traditional party for customs to approach, as the party carrying the goods. The carrier shall provide detailed information not only about the cargo but also about the parties behind each transaction, such as the seller and buyer of the commodities.

But since this procedure proved not to deliver the required data quality, the UCC introduces the multiple filing procedure. This means that different parties may lodge the ENS in more than one submission. According to Article 127(4) UCC, the 'carrier', as the party transporting the goods, shall lodge the ENS. However, the ENS may also be lodged by the importer, or consignee, or another person in whose name the carrier acts, or by any person who is able to present the goods at the customs office of entry. Article 127(6) UCC states that when the security and safety information cannot be provided by the carrier, other persons who have this information and can provide it may be required to do so. This means the carrier remains liable for the sharing of pre-arrival data with customs, even if other parties in the supply chain have provided the information.

The exact data that should be lodged are still under discussion and will be communicated in the Implementing and Delegated Acts. See Annex 5 for an overview of data elements, based on the second drafts of the aforementioned acts. Column 1 holds the data that are to be lodged in the ENS under the current law (IPCC). Columns 2 to 4 give options to lodge everything via the carrier, or in partial sets by the carrier and another party. The definitive acts will come into force in 2016.

We can conclude that the quality issues addressed by the commission are not due to a lack of obligations on the stakeholders. The required obligations are met: the carriers file what is requested and within the stated terms, and they may change and extend the filing when more accurate data become available, even when the goods are already at sea. The situation the EU finds itself in is thus presumably due to the first set-up of the ENS regulations per 1-1-2011, which was performed in a rather speedy manner (Roel van 't Veld). A change in approach should therefore be carefully considered now.

In communication 2014/527 (Commission 2014-3) the commission adds to the 2012 communication and further describes and explores the principles of risk management on pre-arrival. Concerning the data quality of the ENS, it stated:

Improve data quality and filing arrangements (Commission 2014-3):
FOR the timely submission to customs authorities of high-quality and comprehensive data regarding international supply chain movements of goods crossing EU borders;
BY adjusting the EU legal, procedural and IT systems to ensure that operators with a role in the commercial supply chain can submit required information, including advance cargo information, in a harmonised way, taking into account international standards, without undue costs for business models or for customs authorities.

It is obvious that the information known to the commercial stakeholders is wanted. The commission explicitly addresses the issue of 'undue costs' here: only a limited number of stakeholders (the carriers) are concerned with logistics and have installed costly automated systems for filing the ENS.

Imagine now that the smaller stakeholders besides the sea carrier, who vastly outnumber the roughly 70 air and maritime carriers worldwide that file in the existing situation, would also need to file. This would create an untenable situation in which the incurred costs would grow out of proportion. ENS declarations already cost business some 60 million euros yearly (Veenstra, de Putter et al. 2013), and asking for an exponential growth of these costs would create large opposition.

2.3. The Dutch enforcement vision

The Dutch Customs Authorities have published their Enforcement Vision (Dutch vision; 2015). It contains elements of changing the regulatory regime from prescriptive regulation to responsive regulation, by which compliance is determined by the achievement of desired results (Haiko van der Voort; Wim Visscher). This vision is illustrated by the figure below, indicating the concept of the trusted trade lane, and it offers some interesting perspectives on data and data quality. Following the flow of goods in international supply chains, we can distinguish between green, yellow and blue chains or trade lanes. Green lanes are lanes of which the traders and parties in the chain are known and trusted because of applied certifications.

Green lanes can become yellow when all parties have proven to be innovative and fully cooperating with customs. Yellow (trusted) trade lanes are also called Smart and Secure Trade Lanes (SSTL). They are proven trustworthy and visible to customs, and they provide for the physical integrity of the product. They offer interoperability according to standards, for instance by allowing customs to audit their internal business processes through system based auditing. In this way customs will know who packed the box.

The blue lanes stand for the chains of which the traders and trade lanes are not known to customs. Note that these traders may not be certified themselves, but it may also be the case that they are certified and that this fact is simply not known to customs at the moment of assessing the risks. In general, trade lanes which do not comply with the 'green' standards set will be considered blue. These standards may change and may involve more than requiring certifications.


Figure 5 Towards trusted traders and trusted trade lanes (Dutch vision; 2015)

The data entering the prism in the figure above also include the pre-arrival data of the ENS. The vision implies that the declared data will result in customs knowing to which kind of trade lane they belong. Customs may pay less attention to the goods travelling through green and yellow lanes than to the goods dealt with by unknown traders. Customs will direct their valuable and scarce attention to the latter, resulting in relatively more control activities and physical inspections. In order to determine the type (colour) of traders, customs must obtain relevant data, and for this it will need supply chains to be visible.
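A highly simplified sketch of this colour classification is given below. The criteria and their encoding are invented for illustration; the Enforcement Vision itself does not prescribe an algorithm.

```python
# Illustrative lane classification; criteria are hypothetical.
def classify_lane(parties):
    """Classify a trade lane from what customs knows about its parties."""
    if not parties or not all(p.get("known_to_customs") for p in parties):
        return "blue"    # traders or lane composition unknown to customs
    if all(p.get("certified") for p in parties):
        if all(p.get("fully_cooperating") for p in parties):
            return "yellow"   # trusted trade lane (SSTL)
        return "green"        # known, certified traders
    return "blue"             # does not meet the 'green' standards

lane = [
    {"known_to_customs": True, "certified": True, "fully_cooperating": True},
    {"known_to_customs": True, "certified": True, "fully_cooperating": True},
]
print(classify_lane(lane))  # -> yellow
```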

The Dutch Vision also comprises the proposition of optional (multiple) filing. This concept is based on two notions. Firstly, in a trusted trade lane stakeholders like the importer may have an interest in filing extra data on the actors in the foreland and hinterland, including themselves: they will want to let customs know that this really is a green trade lane, in return for facilitations by customs that avoid disruptions in trade. Secondly, it is not advisable, and even impossible, to oblige every importer to install a system connecting to customs in order to file extra data, as was explained before.

For this research we find optional filing of specific interest because it comprises the technical element of data pull and it relates to the open world and the OWA: it does not assume that all required data are present, but instead that some data are optional. Optional filing could be set up making use of linked data concepts, as sketched below. Such an approach would initiate a change from closed to open systems. We take this as a starting point for our research design.
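As a first impression of such a set-up, the sketch below lets an importer publish optional filing data as RDF, which customs can then discover starting from an identifier already present in the ENS, such as the container number. All URIs, namespaces and property names are hypothetical.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import RDF

# Hypothetical vocabulary and identifiers, for illustration only.
EX = Namespace("http://example.org/optionalfiling#")

# The importer voluntarily publishes extra trade lane data as linked data,
# keyed by the container number that also appears in the ENS.
published = Graph()
shipment = EX["container/MSKU1234567"]
published.add((shipment, RDF.type, EX.Consignment))
published.add((shipment, EX.importer, EX.freshRosesBV))
published.add((shipment, EX.grower, EX.nairobiRoseFarm))
published.add((shipment, EX.packedBy, EX.nairobiRoseFarm))

# Customs pulls rather than receives: starting from the ENS container
# number it dereferences the resource and queries whatever it finds.
query = """
    PREFIX ex: <http://example.org/optionalfiling#>
    SELECT ?party WHERE { ?shipment ex:packedBy ?party }
"""
for row in published.query(query):
    print(row.party)   # -> the party that packed the box
```

Nothing here is mandatory: if the importer publishes nothing, customs simply knows less, which is exactly the open world behaviour optional filing relies on.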

2.4. Trusted Trade Lanes (yellow lanes)

The concept of the Trusted Trade Lane (TTL) is a key concept within the Dutch Enforcement Vision. It aligns with the EU customs goal of obtaining more insight into the compositions and collaborations of global supply chains. But Dutch customs have not specified an exact definition, and we may state that the TTL concept is currently open.

We may globally define a TTL as a trade lane whose complete set of stakeholders can be regarded as trusted traders and which functions in a way that is known and trusted by customs. For this, customs should be able to identify trade lanes on the one hand and to determine their trustworthiness on the other (Wim Visscher; Frank Heijmann; dutchvision; 2015). The identification of a trade lane will be supported by identifying a main representative party, such as the importer, and the stakeholders involved should have realized a stable mutual collaboration. Of course the (type of) goods themselves and their physical integrity are also very important, but in this paper we concentrate on the awareness of the stakeholders. In the complex and constantly changing black-box environment of global trade, even defining trade lanes appears challenging. Trade lanes are mostly not recognizable as a whole: the individual stakeholders have no direct objective to generate insight into the whole, and their business may even profit from the obscurity of their activities within the bigger picture. We may see two variants of defining trade lanes:

1. By agreement (between customs and business)
2. By pattern recognition (data analytics)

If formal agreements can be made, for instance with the representative chain partner, on the identification and functioning of a trade lane, then these may directly apply to the certifications and intentions of partners to perform as a reliable, trustworthy trade lane. Customs might agree on pre-defining a level of trust accompanied by a certification for the trade lane. We may characterize this formal agreement on certification as the 'meeting room' variant of TTL. Another, more explorative way of finding TTLs is by pattern recognition on supply chain data. If customs were supplied with enough reliable data from business trade IT systems, they should be able to detect patterns by which they could identify and define trade lanes. Trade lanes will have typologies, like guidance by one leading party, memberships of stakeholders, or peer-to-peer collaborations. Customs would thus create an awareness of trade lanes and of the measures of trust fitting their recent experiences with these trade lanes.
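A much-simplified sketch of the pattern recognition variant: group shipment records by a recurring combination of parties and route, and flag combinations that recur as candidate trade lanes. The record fields and the threshold are invented for illustration.

```python
from collections import Counter

# Hypothetical shipment records as customs might accumulate them over time.
shipments = [
    {"consignor": "NairobiRoseFarm", "consignee": "FreshRosesBV",
     "route": ("Mombasa", "Rotterdam")},
    {"consignor": "NairobiRoseFarm", "consignee": "FreshRosesBV",
     "route": ("Mombasa", "Rotterdam")},
    {"consignor": "UnknownTrading", "consignee": "AnonImportCo",
     "route": ("Shanghai", "Rotterdam")},
]

def candidate_trade_lanes(records, min_occurrences=2):
    """Recurring (consignor, consignee, route) combinations are candidates."""
    counts = Counter(
        (r["consignor"], r["consignee"], r["route"]) for r in records
    )
    return [lane for lane, n in counts.items() if n >= min_occurrences]

print(candidate_trade_lanes(shipments))
# -> [('NairobiRoseFarm', 'FreshRosesBV', ('Mombasa', 'Rotterdam'))]
```

In practice such candidates would only become trusted after customs has checked the parties' certifications and monitored the lane's performance over time.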

Customs may use data analytics not only for creating awareness of trade lanes. In the 'meeting room' variant, monitoring the performance of trade lanes will also be necessary in order to check compliance. The Dutch Enforcement Vision emphasizes that TTLs will allow customs to pay more attention to unknown, more suspect trades. But trusted trade lanes should also be monitored, to stay in control and to detect changes and deficiencies. Such controls are in line with the concept of System Based Auditing, where monitoring is also fully automated.

Note that in both variants, monitoring the continued trustworthiness of a trade lane plays a crucial role. Techniques to enhance data visibility will therefore be important in both cases.

2.5. Intermediate conclusion

It is the intention of the European Commission, as shown in the UCC and its (drafted) acts, to require more data from the carriers about the goods destined for the EU. In general we think this is a good intention, as the new data elements will provide customs with better visibility of the parties behind import transactions. But the UCC builds on an already existing 'closed world' approach, and we can conclude that this may result in poor quality of pre-arrival data, which forms a real societal problem.

An important aspect that may jeopardize the commission's intentions is the cost involved for many smaller stakeholders in lodging the ENS under multiple filing, together with the enforcement challenge that would be created if lodging the required data were made mandatory for the carrier.

We set the following research sub question for this chapter: What does multiple filing mean, as proposed by the European Commission, and what are its (dis)advantages compared to optional filing, as proposed by Dutch Customs instead?

Multiple filing means that the ENS may be lodged in multiple submissions. The carrier remains responsible for the filing of all requested data. Another party is allowed to lodge additional data, defined as a partial data set. If another party does so, the carrier only needs to file 'his' partial set of data; if not, the carrier should file all data elements. And even if another party files additional data, the carrier will still be held liable for the content.

The advantage of multiple filing compared to optional filing is that it is clear that the carrier is responsible for all data and that all required data elements are specified. But this advantage may prove of little importance if the carrier cannot and will not provide the requested data. The advantage of optional filing would be that parties who have the ambition to improve and open up their trade lane data may contribute in a relatively easy way, by publishing additional data for customs to capture. This method is based on motivated cooperation rather than on enforcement.

Multiple filing builds on the traditional closed world principle that data must be lodged, whereas optional filing adheres to the OWA and filing is not mandatory. In this sense it is interesting to add a dimension to our research sub question comparing the filing systems, and to compare an open world to a closed world approach: What are the main differences, considering the desired (business-to-customs) interoperability by data communication and the trustworthiness and reliability of supplied data, in applying a closed world versus an open world approach? We will address this extra question on the supply chain characteristics of interoperability and reliability in our final conclusion in chapter 7.


3. Trying to open the supply chain black box

In this chapter we will focus on the findings and results of the scoped EU projects, such as CORE, in improving the interoperability and visibility of global supply chains by IT innovations.

3.1. Black box supply chains

Global trade is very complex (Veenstra, de Putter et al. 2013). It consists of many stakeholders with different interests, cultures, aims and IT-systems who must deal with many authorities with their specific national rules and regulations.

We may consider the physical and administrative processes of trading goods internationally as global supply chains. But the businesses and stakeholders forming links within a supply chain are often not aware that they are part of a chain. The processes and parties in the chains may be compared with a black box with some beginning and some end. Jensen and Vatrapu (2015) investigated the trade lane of roses from Kenya to Holland and concluded that the stakeholders act as single parties and do not realize they are part of a larger trade lane. Information and knowledge about the supply chain as a whole is lacking; the chain is 'invisible'. Jensen and Vatrapu question whether all parties will benefit from creating visibility of the performance of the chain as a whole. The international trade cost of the trade lane forms roughly one third of the total cost of the product, and two thirds of this cost accrues from barriers that exist when crossing national borders and organizational boundaries. Lowering these costs will be disadvantageous to some of the parties involved: they profit from the obscurity of the trade lane and its regulations, in which they appear to fulfil a difficult and non-transferable role.

The requirements of interoperability and visibility are not easy to realize in global supply chains. The situation in which businesses operate in peer-to-peer relations with connected partners is often guarded as the one in which a business has the best chance to survive in a very competitive trading and transport world. Different cultures, languages and systems, as well as physical distance, also form barriers to sharing and showing insights and trust. In fact, the main topics in governing today's international trade are efficiency and security. Outsourcing, the consolidation of cargo and multi-modal transport chains have complicated the organisation and optimisation of logistics and have added challenges to managing information and data in these logistic chains.

In addition, the landscape of information systems in international logistics is much influenced by its own legacy (Klievink and Lucassen 2013). This forms an extra complication in creating a standard procedure for letting these systems interoperate with a central system, be it a data pipeline or a system of the authorities, making this very complex and costly.

3.2. Interoperability and Information Infrastructure

The ITAIDE project defined the concept of the information infrastructure as a general framework (I3 framework) for these projects. This framework builds mainly on IT innovations and critical IT capabilities, but it also references partner collaboration in general as an important factor. Klievink and Lucassen (2013) investigated a possible major barrier to making innovation work: actors need to be open about their operations, processes and systems to parties that are geographically and culturally on the other side of the world, in a highly competitive environment. For better coordination and functioning of chains as a whole it is important that the different parties work together well. We define this ability as 'interoperability'. It is often projected onto IT as an important enabler. Pokraev, Reichert et al. (2005) described interoperability as the ability of systems to provide and use each other's services effectively. Initially the term interoperability was used as 'the ability of two or more systems to exchange information and use that information' (Association 1990), but recent accounts also take social, economic and organizational factors into account. These factors are also crucial in the case of coordinated border management, where different regulatory and inspection agencies start from different backgrounds and objectives.

Pokraev, Reichert et al. (2005) extended the concept to semantic and pragmatic interoperability. Systems communicate and interoperate by sending messages. When the meaning of these messages is interpreted by the receiving system in the terms of the sending system, the domain definitions of sender and receiver are coordinated: this is semantic interoperability. When the messages produce the effect intended by the sender, this is pragmatic interoperability.
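In linked data terms, semantic interoperability can be supported by publishing an alignment between two parties' vocabularies as data itself, as in the small sketch below. Both vocabularies are invented for this example.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import OWL

# Two hypothetical party vocabularies for the same domain concept.
SENDER = Namespace("http://sender.example.org/terms#")
RECEIVER = Namespace("http://receiver.example.org/terms#")
EX = Namespace("http://example.org/data#")

g = Graph()
# The alignment is itself just data: both properties mean the same thing.
g.add((SENDER.shipper, OWL.equivalentProperty, RECEIVER.consignor))
# A message from the sender, stated in the sender's own vocabulary.
g.add((EX.shipment1, SENDER.shipper, EX.acmeFlowersLtd))

# The receiver interprets the message in its own terms via the alignment.
# (An OWL reasoner would infer this automatically; here we do it by hand.)
for s, _, o in list(g.triples((None, SENDER.shipper, None))):
    g.add((s, RECEIVER.consignor, o))

print((EX.shipment1, RECEIVER.consignor, EX.acmeFlowersLtd) in g)  # True
```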

Although making use of semantic web technologies implicitly means improving semantic interoperability, this is not the focus of this paper. We research improving interoperability by making use of semantic web and linked data technologies, rather than researching semantic interoperability as such, as investigated by Pokraev, Reichert et al. (2005).

ITAIDE defined the I3 framework as a requirement for interoperability. This concept offers a more holistic, socio-technical perspective beyond the IT infrastructure alone. Klievink states that such an inter-organisational information system (IOS), based on IT infrastructure, would be insufficient if it only interconnected the different existing systems. To interconnect in the right way, the parties should trust and understand each other and be willing to share knowledge and insights. For our research, with its focus on innovative IT applications, it is important to note that IT alone will not solve all problems. We need a shared focus, shared beliefs and shared understanding, transcending cultural, language and domain differences.

In their research on a trade lane of avocados from Kenya to Holland, Jensen, Bjørn-Andersen et al. (2014) made recommendations to implement a common information infrastructure and proposed design principles for international trade. The main objective was to reduce cost, especially by speeding up processes. They stressed the need for transparency in handling interactions, not only across national borders but also in communications across diverse linguistic and cultural contexts. They proposed an information infrastructure to cope with the continuously increasing number and complexity of integrated information systems. Such an infrastructure forms a new kind of artefact which has no life cycle, since its components already exist and are continuously evolving. It is a shared, open and unbounded, heterogeneous, and evolving socio-technical system consisting of a set of IT capabilities and their users, operations, and design communities (Hanseth and Lyytinen 2010). Hofman calls this a federated system, or system of systems (Hofman 2015).

We may conclude that the information infrastructure concept touches on aspects like trust, shared knowledge, domain specifics, culture and language, which are not, or only loosely, related to IT. We will find later, in exploring the web of data, that it offers ways to address these other aspects. Agreeing on semantics in advance might help interoperability across cultural and physical borders.

It is interesting to note that the OWA has strong analogies to the view of Hanseth and Lyytinen (2010) that new artefacts evolve in an organically growing socio-technical system.


3.3. Data pipeline

In the course of the EU projects the data pipeline concept has evolved from the more generic concept of information infrastructure. This more specific concept was proposed by Dutch and UK customs to improve supply chain visibility, efficiency of trade, compliance, and the effectiveness of border control and supervision.

Hesketh (2010) stressed the importance of knowing who stuffed the container and what the exact data of the packed goods are. To mark the point at which data should already be shared between parties, he defined the consignment completion point (CCP) as the moment the container has actually been stuffed and the details of the goods are known by the stuffing party. This is the first point in time at which complete data on the packing lists and purchase orders should be made available in digital format. The packing list should serve as the basis, also for the manifest or Bill of Lading for the shipment. If it proves wrong, then all following steps in the chain are using the wrong information, which may lead to disturbances in the trade lane. He draws the data pipeline as a kind of IT bus connecting the parties involved at the different stages of the supply chain, in the transactions as well as in the physical operations.

Figure 6 Global data pipeline (Hesketh 2009): the pipeline connects the transaction layer (seller and buyer, consignor/exporter, consignee/importer, freight forwarders or 3PLs, shipping line) with the operations layer (factory, warehouse, inland terminals, terminals at origin and destination, ports of departure and arrival).

We may note that it is vital that the data registered at the CCP are of the expected quality. The ENS data, as explained in chapter 2, are assumed to be of lower quality because the requested time of lodging is rather early. The data quality aspect of timeliness, as addressed by the European Commission (Commission 2012), should also regard too early delivery as a detrimental factor. Timeliness is crucial. The optional or multiple filing options could be used to supply the right data when they become available, depending on the time limits set for these. In the data pipeline concept the different existing IT systems of stakeholders would be connected by IT services via a virtual bus, which provides integrated access points to the different sets of information that already exist but are fragmented throughout the supply chain (Klievink and Lucassen 2013).


3.4. Innovative concepts

Klievink, Van Stijn et al. (2012) describe that the data pipeline accesses the data of existing information systems, or reliable copies thereof, that are used by the parties in international supply chains. They emphasize that it is an IT innovation to enable capturing data at the source of a supply chain. It is stressed that supply chains could improve very much in efficiency when data become more visible. As an example of the benefits for businesses, they mention the insights that would be gained for deciding on multi-modal transport. Klievink, Van Stijn et al. describe new methods such as ‘piggy backing’ and ‘system based audit’ that governmental authorities could use to benefit from the sharing of data.

Piggy backing is defined as the reuse of business data and commercial information streams for governmental supervision and control purposes; with system based audit or system based control, government examines a company’s internal systems to identify and assess potential risks (Tan, Bjørn-Andersen et al. 2010). Both principles imply that authorities like customs could get access to the IT systems of stakeholder businesses in the supply chain. This is also described as a data pull method, in which the receiving party collects its data actively, as opposed to the data push method (Tan, Bjørn-Andersen et al. 2010). Data push refers to the traditional way of B2G communication, in which businesses are required to deliver their data, information and declarations to the authorities in conformance with strict regulations.

Supposing that businesses succeed in realizing a data pipeline, new options in B2G communication would open up. Customs might collect and capture data from the IT systems of businesses. Different levels of communication are possible, depending on the availability, the type and the set-up of the business data. The implementation of the concepts of data pull and data push can lead to major shifts in how businesses and authorities like customs operate. The current paradigm of B2B and B2G data communication is the message paradigm (Hofman and Aldewereld 2011): the carrier submits the ENS to customs. In B2B relations, involving business transactions like the transport or discharge of a container, the sender specifies which data will be pushed to the recipient. In B2G relations the authorities specify the required data.

The message paradigm has brought big advantages to businesses and authorities using electronic messaging: better data quality, cost reduction, fewer errors, etc. But the downside is that the specification and implementation of messages is costly and time-consuming. Hofman and Aldewereld (2011) listed other downsides of messaging: messages form closed systems, with each national customs applying slightly different usage, and small and medium enterprises lack the expertise and funding to implement all the different Message Implementation Guides.

Hofman and Aldewereld (2011) pose the resource paradigm as the alternative to the message paradigm: companies open up resources from which data can be ‘pulled’ by regulators or other partners. They investigated what they called the paradigm shift, and stated that such a shift could lower the threshold for businesses to report, because the costs of developing and maintaining IT systems would be lower than those of implementing all the adequate messages. This new paradigm holds that resources should make their data and information available to the outside world. We can see containers as resources, which can supply data via smart seals or security devices. But global businesses and authorities are resources themselves as well, and should provide the relevant data. This resource paradigm is also adhered to by the semantic web. But this does not mean, in our opinion, that the paradigm is necessarily aligned with the OWA: it could also be implemented using peer-to-peer connected closed systems.

3.5. Concepts in architecture

Hofman and Bastiaansen (2013) described an integration architecture for seamless interoperability in the trade lane and piggy-back options for authorities. In fact this architecture was driven by the new concepts, with the data pipeline as the main representative.


They point out that a centralised approach, using centralised IT systems, is not feasible in large-scale logistic ecosystems. The solution adopted by CASSANDRA is based on the definition and standardisation of interfaces. These should allow customs to piggy back on data that is already available to enterprises in logistic chains.

Figure 7 Seamless integration by interfacing (Hofman and Bastiaansen 2013)

Hofman and Bastiaansen designed seamless integration by means of interfaces, such as the Data Sharing (DS) interface for sharing data on business events, and the Data Capture (DC) interface for the authorized retrieval of information, for instance by customs when piggy backing. They defined three methods for implementing the DC interface: the push method, the pull method, and a combination of both. They describe the push method as the trader pushing data at specific times to authorities, such as a declaration implemented by a message; and the pull method as the Linked Open Data (LOD) concept, where authorities have direct access to traders’ data by means of opened access to the URLs where the data are to be found. We will discuss LOD as a linked data concept in more detail in chapter 4. Hofman and Bastiaansen use the concept in the sense that data will be linked to, and thereby become accessible to customs.

Integration patterns

In their classical work on enterprise integration, Hohpe and Woolf (2004) listed four integration styles for sharing data between applications:
(1) file transfer (exchange files with data via a common transfer mechanism),
(2) shared database (use the same stored data),
(3) remote procedure invocation (synchronous direct interaction via an interface such as an API),
(4) messaging (asynchronous data transfer by a messaging system).

Messaging (4) is the standard approach for electronic B2G communication (message paradigm, data push). The SOA architecture by Hofman and Bastiaansen makes use of interfacing and applies style (3) through web services and APIs. The data-pull method could make use of file transfer (1), but also of web services and APIs (3). When applying the definition of data pull, namely that the request is initiated by the receiver, style (2) would also match this approach; in practice, sharing databases will not be feasible. But when looking at the semantic web we will find that linked (open) data most resembles this style, as links provide direct access to other databases. A minimal sketch of a data pull over a web-service interface is given below.
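The following sketch illustrates the data-pull method over a web-service interface (integration style 3): the receiver initiates an HTTP request for data that the trader has exposed at an agreed location. The URL and field names are hypothetical; this is an illustration of the pattern, not an implementation from the cited architectures.

import requests

def pull_consignment_data(base_url, consignment_id):
    """The receiver (e.g. customs) initiates the request: a data pull."""
    response = requests.get(
        base_url + "/consignments/" + consignment_id,
        headers={"Accept": "application/json"},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()

# Usage: pull the packing-list data that a trader exposes at an agreed URL.
# data = pull_consignment_data("https://trader.example.com/api", "100")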

Wikipedia (wiki-push; 2015) defines the push method as the method where the request for transmission of information is initiated by the publisher, and the pull method as the one where this is done by the receiver. This aligns with the following example. Imagine data or information has to be passed electronically from party A to party B. If party A is in the lead in the communication process, then this is a data push; the other way around it is a data pull. Hofman and Bastiaansen describe the combination of both: party A notifies party B with identifying data by means of a data push, which the latter must then use to capture the information by data pull. We will address later (chapter ) that this combination is an important aspect of the envisioned communication between traders and customs. Customs must have an indication (‘seed’) of when and what data to pull from which sources provided by the traders. We will propose this method in the GOWA concept.

Hofman and Aldewereld (2011) suggest that traders should generate events to which customs could subscribe (publish/subscribe to events). The publications should contain the links by which information can be ‘pulled’. Note that publish/subscribe (pub-sub) is, according to Wikipedia (wiki-pubsub; 2015), a messaging pattern where senders of messages do not send directly to specific receivers. Instead, published messages are characterized into classes, without knowledge of what, if any, subscribers there may be. Similarly, subscribers express interest in one or more classes, and only receive messages that are of interest, without knowledge of what, if any, publishers there are. This view differs from the mechanism proposed by Hofman and Bastiaansen, where publisher and receiver have a peer-to-peer agreement on content. The essence of their proposed architecture is to create loosely coupled systems that use standard web-service interfaces provided to specific groups of stakeholders. The usage of the services is to be triggered by events in the supply chains. These publish/subscribe events provide the dedicated receiver with the location of and access to specific data, according to data governance agreements.
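A minimal sketch of this combined mechanism, assuming simple JSON messages over HTTP: the trader pushes a lightweight event whose payload carries only the event class and a link, and the subscriber uses that link as the seed for a pull. The event structure, endpoints and field names are our own illustrative choices, not a standard from the cited work.

import requests

def publish_event(event_class, data_url, subscriber_endpoints):
    """Trader side (push): notify subscribers with only identifying data
    and the link where the full information can be captured."""
    event = {"class": event_class, "dataUrl": data_url}
    for endpoint in subscriber_endpoints:
        requests.post(endpoint, json=event, timeout=10)

def on_event(event):
    """Customs side (pull): use the pushed link as the 'seed' to capture
    the actual data at the moment it is needed."""
    response = requests.get(event["dataUrl"], timeout=10)
    response.raise_for_status()
    return response.json()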

The essence of the architecture by Hofman and Bastiaansen (2013) is that the seamless interoperability required for global trade lanes should be realised by bringing together the heterogeneous systems and solutions of the different actors through interfaces. This calls for exact definitions of whether the technical solution is provided by messaging or by web services. Traditionally this will lead to peer-to-peer solutions between the trade lane partners and authorities, whether based on data push, data pull or a combination. Hofman and Bastiaansen propose to avoid peer-to-peer solutions by offering generic published services, stored in distributed service registries, that partners must adapt to. They must still support the required technical specifications like data standards, formats and interaction protocols. These are in the form of closed system definitions. The data pipeline architecture by Hofman and Bastiaansen (2013) is thus set up according to the traditional closed world concept. In chapter 4 we will discuss the characteristics of closed world versus open world practices.

The centralisation of B2G services performed by port community systems (PCS) or business community systems (BCS) is good practice in supply chains. These intermediate partners, which may be regarded as service and data brokers, provide software and dedicated communication systems. They act for a large number of smaller partners, who benefit from these services without the need for their own software investments. The CORE projects also involve these service providers. In our case study trade lane in CORE this role is taken by Descartes (descartes; 2015), who acts for traders as a service and software provider towards the customs authorities.

3.6. Research on Linked Data

In this paragraph we look at research results from EU projects on using linked data. For ease of understanding, the reader may prefer to first read chapter 4, which describes linked data itself. The description of the work package for CORE (CORE 2013) does not explicitly reference linked data, but it does contain a reference to developing the Core Ecosystem and its knowledge graph (Karakostas 2015). This sub-project has connections with linked data.

The CORE Knowledge graph

The knowledge graph concept is newly coined in CORE work package 8 (Core Ecosystem). It intends to design an architecture for a virtual collaborative platform that enables CORE Ecosystem participants of multiple supply chains to share their knowledge of supply chains for their mutual benefit. Karakostas (2015) describes it as follows.


In short, this report proposes how to realise the key concepts of the CORE approach such as the data pipeline, using an overlay ‘system of systems’, a federated virtual superstructure that connects systems and databases of organisations participating in supply chains. One feature of this superstructure is that it forms a huge virtual graph called the Knowledge Graph, that allows new knowledge about supply chains to be discovered (Karakostas 2015).

The ‘graph’ concept derives from mathematics and has been popularized by semantic web tooling. Google implements its knowledge graph, as we mentioned before.

In Karakostas (2014) the knowledge graph of the proposed architecture of the CORE ecosystem is described in an RDF-like manner (see 4.3), consisting of nodes and relationships, described at both the actual and the meta level. In fact this knowledge graph is held in the graph database Neo4J (neo4j; 2015).

Karakostas describes it as a top-layer visualization of the data pipelines, which grows and expands as more and more supply chain data is added to the platform by means of adding business documents. An example of information from the Knowledge Graph of Karakostas’ case study is presented in the figure below.

Figure 8 Visual representation of a Knowledge Graph (Karakostas 2015): the depicted nodes include the seller (Sunshine Inflatable Toys Co Ltd), the shipper (Sunshine Inflatable Toys exports division), the shipper’s agent (China-US Exports Ltd), the carrier (DHL Global Forwarding), the buyer and importer/agent (TOYS R US imports division), the customs authorities of Hong Kong and of the US (New York), a Hong Kong ICRIS registration (number 4344344), and document nodes such as the international contract of sale (with document location, status and creation date), the shipping order/packing list, the bill of lading, and the export and import declarations, connected by relationships such as ‘located at’.

Karakostas clearly advocates the OWA: ‘Ecosystem participants only contribute to the knowledge graph as much or as little as they wish. Transactional data and business documents are not stored in the knowledge graph. The graph merely points to their existence, availability and location/source. The knowledge graph is a live structure. As supply chains are planned executed and monitored the knowledge graph becomes populated. The more members contribute, the higher the added value of the Knowledge Graph (the network effect).’ This view corresponds strongly to the proposed optional filing, albeit on a different level. The Knowledge Graph is meant to produce insights and knowledge at a higher level, whereas optional filing has the more direct purpose of supporting a risk assessment of a specific goods flow. But both are ultimately meant to support risk management purposes.


The knowledge graph architecture still leaves many questions to be answered. It is not clear where the extracted data are stored; it is stated, however, that data should be filtered to protect commercial interests. The architecture has been drawn up only recently (May 2015), and a concrete plan for design and implementation is not foreseen in the near future.

The intention to make use of the network model, with its properties of linking new data and growing over time, delivering new knowledge, is clear. New in this approach is that the graph database should be fed by all trade lanes in the EU, delivering overall views and knowledge and building up a history. Other pipeline studies had focussed only on individual trade lanes.

Visibility with linked data for customs risk analysis

Hofman (2011-2) has explored the option of using linked data to create visibility for supply chain risk analysis in a broader sense than only pulling data. Hofman posited linked data as an alternative to the usage of a Service Oriented Architecture (SOA) for the retrieval of data from source systems.

Hofman noted that retrieval of data in the linked data approach may follow three patterns (Heath and Bizer 2011):

(1) Data-crawling applications crawl the web in advance by traversing links, and afterwards clean, fuse and filter the data for usage. This pattern requires data replication. The advantage is that the processes of data capture and data storage are decoupled.

(2) On-the-fly dereferencing applications dereference URIs and follow links the moment the application requires the data. The advantage of this pattern is that applications never process stale data. The disadvantage is that complex operations following many links are slow.

(3) Query-federation applications send complex queries directly to a fixed set of data sources containing links and so-called SPARQL endpoints (see 4.5). A major disadvantage is that complex queries are difficult to formulate and mostly perform badly.

In general the decision which pattern or mix of patterns to use depends on several elements, as referenced by Heath and Bizer (2011): (1) the number of data sources that an application intends to use, (2) the degree of data freshness that is required by the application, (3) the required response time for queries and user interactions and (4) the extent to which the application aims to discover new data sources at runtime.

Hofman concluded that data-crawling seems the best pattern for retrieving data from supply chains. On-the-fly dereferencing may not be an option since many traders operate in many different trade lanes. For our case study design we still propose this pattern, since we want to limit the accessible trade lanes by Guided-OWA, as we will explain in chapter .
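As an illustration of pattern (2), the sketch below dereferences a start URI at the moment the data is needed and lists the outgoing links that could be followed next. It uses the rdflib Python library and a public DBpedia resource as a stand-in for a trader’s resource; it assumes the servers offer RDF via content negotiation.

from rdflib import Graph, URIRef

def dereference(uri):
    """Fetch the RDF description of a resource at the moment it is needed."""
    g = Graph()
    g.parse(uri)  # HTTP GET with RDF content negotiation
    return g

start = URIRef("http://dbpedia.org/resource/Rotterdam")
g = dereference(str(start))

# Follow the outgoing links of the start resource one step further.
for _, _, obj in g.triples((start, None, None)):
    if isinstance(obj, URIRef):
        print("next candidate to dereference:", obj)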

Hofman designed the data retrieval from the data pipeline assuming that these data provide transaction links and URIs to link to (see figure 9). Customs has automated data capture processes directed towards all transactional data, but may also request specific data from a trader. Semantics should be defined for all shared data.


Figure 9 Global supply risk analysis (Hofman 2011-2)

Hofman also describes the advantage for businesses of lower administrative costs if they were to offer their data for customs to piggy back on, realizing the message paradigm shift: ‘Traders, being shippers and logistic service providers, publish their supply chain data according to an agreed ontology including the transaction links.’

Both Karakostas and Hofman describe the importance of ontologies in general. They both point to the WCO data model, which may be (re)used in implementation. We support this suggestion in paragraph 4.4.

Regarding both views we conclude that they are high-level visions. Hofman focusses more on data retrieval issues, Karakostas more on the end results and on building up histories of trade lanes for risk analysis. We have not found information on concrete designs towards an implementation; this may yet come out of the running CORE project. Hofman (2015) recently published a paper on a federated infrastructure for the data pipeline.

3.7. Intermediate conclusion

The data pipeline concept has evolved most prominently from the research projects so far. It seems to set the mark for creating visibility and interoperability with authorities. This could lead to the paradigm shift in B2G communication in which authorities use data pull mechanisms. The concept of linked data has so far been touched on primarily as a method to provide accessibility to data and data pull.

The data pull method is also mentioned implicitly in the UCC. Article 127.8 states: “Customs authorities may accept, instead of the lodging of the entry summary declaration, the lodging of a notification and access to the particulars of an entry summary declaration in the economic operator's computer system.” Access to another computer system, and thus a kind of data pull, is explicitly allowed for. But such access is limited to the particulars of the ENS, so in fact it leads to the agreed ‘to be declared’ data. Such a data pull therefore still relies on the closed world assumption.

The CORE project focuses primarily on the data pipeline concept (CORE 2013). It describes 22 projects, of which 11 are demonstrator projects for a data pipeline. In practice, however, the view of what the data pipeline is appears not to be explicit. Alternatives for sharing data in the pipeline are not explored. Some individual connections leading to a dedicated result have been set up, like the ‘Felixstowe pipelines’ in the UK (Eric Geerts, Bram Klievink). It is still unclear whether they will scale up. We have the impression that the concept is generally seen as the solution for data sharing, while it is not yet clear how it will operate. Data pipelines should be set up as a business initiative, but the incentives for businesses to realize one are not obvious. We have noted that for some businesses, such as information brokers whose services are commercially viable only by virtue of the lack of transparency, it would even be detrimental. The involvement of customs, with its incentive to improve risk assessments by obtaining more data, is much more obvious. Although the parties collaborating in living labs have proven to function very well in creating a cooperative environment based on trust, it seems that authorities like customs are the most active in its promotion.

Some research on linked data has been performed, of which the Knowledge Graph is the most recent and most eye-catching result. But these abstract architectures have not landed in realistic designs so far, and their outcomes remain to be seen. They have not explicitly taken account of open world initiatives and could also be realized using closed world systems. We observe that the OWA, which may serve a crucial role in customs risk management, could play a more prominent role in research. We hope to give some direction for this by proposing the Guided-OWA in chapter .

An important issue for Dutch customs in the CORE trade lanes is that they do not want to store intelligence data used in risk assessments other than the traditionally declared data. In this sense Dutch customs would not directly agree with Karakostas’ proposal, which has the explicit goal of building a history of knowledge.

Our research sub-question to answer here is: what are the state-of-the-art findings in improving interoperability and visibility of global supply chains by IT and linked data?

We have noted that the concepts of data pull and data pipeline are prominent within the EU projects. The data pipeline architecture promotes connectivity through generic interfaces. It proves a difficult concept to realize and seems to hold an internal contradiction: it should be a virtual bus with no existence of its own, but at the same time it should have some central infrastructure, a data sharing area and its own configuration management. Although the concept is addressed very prominently, demonstrable results and generic prototypes are still awaited.

We also found that creating and improving interoperability cannot be achieved by means of IT alone. This was projected on the information infrastructure. In practice, however, IT has been investigated as the means to create connectivity between trade partners, making use of interfacing by web services and APIs. Linked data has been addressed only on a small scale, primarily as a method to realize data pulling. But recently a ‘semantic web impulse’ has emerged from the EU project CORE in the form of the desired knowledge graph.

In this paper we investigate and elaborate further on what linked data essentially is, and we propose a design for how customs could directly benefit from it. The focus of this design is on getting to know the business stakeholders and discovering Trusted Trade Lanes. We will examine how this design fits the researched innovative concepts.


4. Web of Data

4.1. Web of data for interoperability

The web of data represents the vision that the hugely increasing amount of data available on the web will be interconnected globally. The meaning of the data, its semantics, will itself be stored as data, making use of vocabularies and ontologies. This allows knowledge to grow in an automated way and in incremental steps.

The great achievement of the world wide web, the web of documents as we know it now, is that it lets us browse via hyperlinks to other sites or documents without any notable effort. The promise of the web of data is that it will allow us to use the knowledge within documents, sites or databases right away, equally without effort. The practice of linked data technologies is expanding, but is still limited to specific domains and interests, such as governments making more and more data public, libraries and education, life sciences and social media. A cross-domain example is DBpedia (dbp; 2015), which provides the data of Wikipedia in structured open form (Heath and Bizer 2011).

The web of data has the potential to improve the interoperability between businesses and with governmental authorities. Figure 10 below depicts an evolution in interoperability. Businesses are used to having their own IT systems behind firewalls and to communicating via documents, physically or digitally (I). More advanced is communication via data exchange, according to the message paradigm as practised between businesses and customs (II). The downside of this is the duplication of data and the risk that data are not up to date.

In the web of data view (III) the individual databases are opened up in a controlled way, providing current information according to agreed semantics that are open and conform to international standards.

Figure 10 Towards a new interoperability via the web of data (according to: verhelst; 2015)

The definition ‘the Semantic Web is a webby way to link data’ (Dave Beckett) captures the concept of linked data brilliantly. It catches the main idea of the linked data paradigm: using the web model to publish and connect raw data (cambr; 2015). At the same time figure 10 makes clear that the steps from left to right directly imply the shift from closed world to open world systems.


The semantic web provides the standards for the integration and combination of data on real world objects via the web, but it also provides the language for recording how these data relate to their real world objects, by defining representations of them in ontologies.

Figure 11 Linking raw data (Mendes, Mühleisen et al. 2012)

Figure 11 shows the basic possibilities of linking and linking again from data source to data source. These links are not implemented all at once but may evolve as in an organic network. The figure shows the essence and innovative power of linked data: it can enlarge and create new information by connecting the underlying individual characteristics of existing data sets. Such cross-linking of data elements has not been possible before. It creates opportunities to define new information and to discover new insights for which the data are already available.

4.2. The foundations and concepts

The web of data is built on three foundations: RDF, SPARQL and OWL (Allemang and Hendler 2011): RDF (Resource Description Framework) to model the data to be stored and linked via the web, SPARQL as the query language for RDF, like SQL is for relational data, and OWL (Web Ontology Language) to meaningfully describe data for specific domains, itself modelled in RDF. They are open standards governed by the W3C. We will discuss them and use them in elementary form in our research design in order to show these principles. Implicitly these foundations also embody the aspects of reasoning (according to semantics) and the OWA. The shift from CWA towards OWA will be an important one for businesses and authorities to make.

Linked data is still evolving; it started to develop from 2007 onwards (wiki-ld; 2015) and has merged with LOD (Bauer and Kaltenböck 2011). The initiatives of governments to be more open to the outside world resulted in the Open Government Data (OGD) movement to open up government and public administration data. This movement has adopted the semantic web foundations, which put the L of Linked before the Open Data initiative of governments.

In the development of publishing data, and data as links to be used by others, a differentiation of sources is natural. Tim Berners-Lee (tbl; 2015) has described the continuum in publishing data according to the LOD and semantic web standards in a five-star ranking scheme. One star means: make your stuff available on the web in any format (for people to see). The ultimate five-star stage is to provide data in RDF format, linked to other sources (linked RDF). For our research design we assume that we can link to data which is machine readable using an open standard.

Berners-Lee later added his four ‘linked data rules’:
1. Use URIs for names of things
2. Use HTTP URIs so that people can look up those names (the URIs are dereferenceable)
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
4. Include links to other URIs so that they can discover more things.


URI stands for Uniform Resource Identifier. It serves to uniquely address a resource connected through the web. The term URL is more commonly used; it is a specific URI that identifies a location on the web where a resource may be found. In practice these acronyms are often mixed up. Also used are URN (Uniform Resource Name), to address a name, and IRI (Internationalized Resource Identifier); IRI is a generalization of URI that fully supports international characters. An HTTP URI is an identifier that people can use to show the content in their browser. This means that the URI is dereferenceable: internet protocols (like HTTP) can be used to retrieve the resource. An example of a URI is: http://linkeddata.openlinksw.com/about/Berlin#this

Rule 1 really means that we should think and act in terms of (data on) ‘things’ that are linked to other (data on) things (resources). Rule 2 means that humans must be able to understand, check and interpret the resources. Rule 3 means that the rationales, definitions and semantics of the resources and their links must be available in the RDF standard, in an ontology or another vocabulary. Rule 4 represents the ultimate goal of linked data: organic growth of domain networks and knowledge.

This notion of rules and aspects should guide discussions on what linked data really is. One could easily think that it is simply a method which links data on the web to other data; because this is the common way of thinking in the web of documents, this misconception is easily formed. We will address the semantic web foundations in the next three paragraphs.

4.3. Resource Description Framework (RDF)

RDF is the foundation of the semantic web, and all its other standards depend on it. It is a framework that builds on the simple basis of a property-value mechanism and is formed by subject – predicate – object structures, like ‘person – owns – car’ and ‘carrier – has declared – ENS’. These structures express information about resources, which can be anything, including documents, people, physical objects and abstract objects. RDF is intended for application processing of information on the web. It provides a common framework for expressing the information so it can be exchanged between applications without loss of meaning (w3c-rdf; 2015).

RDF facilitates the merging of data even if the underlying schemas differ. It extends the linking structure of the web by using URIs to name the relationship between things as well as the two ends of the link (the ‘triple’). Using this simple model, structured as well as semi-structured data can be mixed, exposed and shared across different applications (w3c-rdf; 2015).

RDF can be used for human communication and understanding as well as for automation by tooling. And this can lead to new knowledge by explanation and even prediction (Allemang and Hendler 2011).

The simple basic form allows for modelling complex semantics. RDF was originally developed under the W3C to model meta data and semantics, but it is now also used to represent and model data links in general. At the same time RDF addresses the fundamental issue of managing distributed data over the web. In the semantic web we refer to ‘things’ in the world as resources: a resource can be anything that someone might want to talk about. Note that the properties of objects, or predicates in RDF vocabulary, are themselves resources that describe relations between resources.

The subject-predicate-object structures are statements about things, also called ‘triples’. RDF allows one to express any statement about any resource, and anything with a URI can be a resource (Antoniou and Van Harmelen 2004). Databases that store data in the RDF format are called RDF data stores or triple-stores. Unlike a relational database, a triple-store is optimized for the storage and retrieval of triples (wiki-tr; 2015). For the analysis of our case study we have used the triple-store technology of AllegroGraph (allegro; 2015). For a visual presentation of RDF for meta data and regular data, we give an example using our case study supply chain of roses.

page 30

Page 31: Discovering TTL using LD - Rolf Nijenhuis

Some regular triples are:

(subject) (predicate) (object)
Grower ‘X’ grows Roses ‘Baccara’
Consolidator ‘Y’ stows Container ‘100’
Carrier ‘Z’ carries Container ‘100’

The meta level of data, defining the contents of the ‘real’ data above, is also modelled in triples.

(subject) (predicate) (object)
Business has Business Type
Business Type is ‘grows’ or ‘stows’ or ‘carries’

This meta level allows for adding various semantic rules and restrictions, for instance that a business that grows flowers cannot also be in the carrier business, if this holds in the domain.

There exist some standard ways to notate triple data, so-called serializations: RDF/XML, N-Triples and Turtle. They all contain the property-value descriptions and references to ontologies, so that the data contained in a set of triples can be fully found (connected to) and understood. For our case study we will use the Turtle serialization.
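As a small illustration, the sketch below renders the regular triples above in Turtle and loads them into an in-memory store using the rdflib Python library. The namespace and term names are our own illustrative choices, not terms from an agreed customs ontology.

from rdflib import Graph

turtle_data = """
@prefix ex: <http://example.org/tradelane#> .

ex:Grower_X       ex:grows   ex:Roses_Baccara .
ex:Consolidator_Y ex:stows   ex:Container_100 .
ex:Carrier_Z      ex:carries ex:Container_100 .
"""

g = Graph()
g.parse(data=turtle_data, format="turtle")

for s, p, o in g:   # iterate over all triples in the store
    print(s, p, o)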

Serializations present the data in a tabular-like form. Sometimes it is more convenient to view the triples as a directed graph. This method originates from mathematics (graph theory) and is powerful for gaining overview by visual presentation. It is specifically interesting when more than one triple refers to the same thing, creating branching links. The objects and subjects are presented as nodes, and the predicates as labels on the edges.

Figure 12 below presents a graph of the stakeholders involved in a trade lane for the import of roses. It serves to illustrate the overview and insights which graphs may give.

Figure 12 Graph on (sometimes anonymized) stakeholders in a supply chain

The underlying data were inserted by script into a triple-store. They had been gathered manually within the trusted living lab environment of the CORE project. The figure shows a result of opening the black box of supply chain stakeholders. In chapter 5 we will elaborate on discovering the (trusted) trade lanes using linked data, based on this trade lane case study.

Heath and Bizer (2011) have listed the main benefits of using the RDF data model in a linked data context. We list the ones that are of direct relevance to our research subject concerning the visibility of the supply chain.


- Information from different sources can easily be combined by merging the two sets of triples into a single graph
- RDF allows information to be represented using terms from different vocabularies
- RDF allows modelling at different levels of structure, meaning that tightly structured data as well as semi-structured data can be represented

Framework for interoperability

A very important feature is that RDF allows for linking and tying up data of all kinds of forms: not only by sharing or reusing vocabularies, as indicated by the second bullet above, but also by connecting all kinds of data storage in a standard way. This data federation aspect of RDF is well described by Allemang and Hendler (2011), whom we quote below.

‘The RDF data model was designed from the beginning with data federation in mind. Information from any source is converted into a set of triples so that data federation of any kind—spreadsheets and XML, database tables and web pages—is accomplished with a single mechanism. As shown in figure below this strategy of federation converts information from multiple sources into a single format and then combines all the information into a single store. This is in contrast to a federation strategy in which the application queries each source using a method corresponding to that format. RDF does not refer to a file format or a particular language for encoding data but rather to the data model of representing information in triples. It is this feature of RDF that allows data to be federated in this way. The mechanism for merging this information, and the details of the RDF data model, can be encapsulated into a piece of software—the RDF store—to be used as a building block for applications.’

Figure 13 RDF-application architecture (Allemang and Hendler 2011)

The application architecture above shows the connection of RDF files (serializations in, for instance, Turtle format) to RDF triple-stores. These files may also be published for sharing on the web, like a shared ontology. They may exist in several types, which RDF can bind together to improve interoperability. We will use this architecture in our design; the sketched application could be a customs risk assessment application.
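The sketch below, using the rdflib Python library, shows the federation strategy of figure 13 in its smallest form: triples extracted from two different sources are merged into a single graph by a plain union. The sources and vocabulary are illustrative assumptions.

from rdflib import Graph

# Triples extracted from one source (e.g. a manifest)...
manifest = Graph().parse(data="""
@prefix ex: <http://example.org/tradelane#> .
ex:Carrier_Z ex:carries ex:Container_100 .
""", format="turtle")

# ...and from another source (e.g. a packing list).
packing_list = Graph().parse(data="""
@prefix ex: <http://example.org/tradelane#> .
ex:Container_100 ex:contains ex:Roses_Baccara .
""", format="turtle")

# Merging two sets of triples is simply the union of the graphs.
combined = manifest + packing_list
print(len(combined))  # 2 triples, now queryable as a single graph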


Bottom up analysis

Quite another advantage of RDF is realized at the Dutch Tax and Customs Administration (DTCA). Data from different national sources, varying from the tax office itself to registration offices of ownership of real estate and automobiles, banks and the chamber of commerce, are loaded in RDF format into triple-stores. By using common identifiers like the ‘citizen service number’ and loading these data as triples, new insights from connections between these data are detected and shown in graphs. In this way frauds are detected that would not have been found so easily in traditional data handling. New ways of supervision are created based on the findings. This method of using RDF to detect and create insights into the activities of Dutch citizens is not per se making use of linked data as we define it here. It is a bottom-up method based on real data to see what connections can be found. In this sense it shows the power of RDF and graphs to discover and to gain insight and knowledge (Marcel van Mackelenbergh). RDF has hereby proven its maturity for usage in e-Government applications.

4.4. Domain modelling

The Web Ontology Language (OWL) is the W3C standard to model a domain of the real world, in RDF format, and to create ontologies. An ontology is an explicit and formal specification of a conceptualization (Staab and Studer 2010). It creates a shared understanding of concepts and allows for formal usage and reasoning by tools. By means of adding semantics to our data we can turn the dumb web into a smart web (Allemang and Hendler 2011).

In modelling triples it is common to make use of and refer to different ontologies. In fact it is a big advantage to reuse the domain knowledge of different aspects in new ontologies, offered by different parties. Examples of often reused vocabularies are Dublin Core (dcore; 2015) and schema.org. In general, ontologies are published on the web for reuse, as are the triple-store data to be used in linking data.

An ontology is a semantic model. We can distinguish some different kinds of semantic models, which describe what is meant by certain data or concepts (verhelst; 2015).

- glossary: flat list with concepts and descriptions
- thesaurus: concepts, definitions, alternate notations, broader/narrower relations
- taxonomy: classification of properties, tree structure; sub/super types
- ontology: complex network of concepts, properties and relations

SKOS (Simple Knowledge Organization System) is the W3C recommendation for defining simpler semantic models like glossaries, thesauri and taxonomies. Another W3C standard is RDFS (RDF Schema). This domain language provides some semantic modelling options as a set of classes with certain properties. In general we can state that the complexity and formality of vocabularies grow in the order of the following enumeration: RDF, SKOS, RDFS, OWL. An ontology has options for adding logic, deduction and inference to a model.

Ontology modelling in OWL will confront the user with complexities that might not be necessary for the specific domain to model. There are tools which make it fairly easy to define ontologies (Protégé, TopBraid Composer and others). If the domain to model is fairly straightforward, then it is advisable to use the simpler variants like SKOS. Through fixing the semantics of certain ingredients, RDF/RDFS enables us to model particular domains. A full ontology is mostly not necessary, and a simple vocabulary can model and explain the domain. But modelling will remain a craft (verhelst; 2015). For our case study model we have used the tool TopBraid Composer to define a simple ontology of trade lane stakeholders supporting our research design (tbraid; 2015).
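As an impression of what such a simple model looks like, the sketch below defines one class and one datatype property with a domain, written in Turtle and loaded with the rdflib Python library. The names echo the example in Annex 1, but the namespace and exact terms here are our own illustrative choices.

from rdflib import Graph

ontology = """
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix c:    <http://example.org/customs#> .

c:Stakeholder_via_Customs a owl:Class .

c:has_reported_id a owl:DatatypeProperty ;
    rdfs:domain c:Stakeholder_via_Customs ;
    rdfs:comment "Identification as reported in a customs declaration" .
"""

g = Graph()
g.parse(data=ontology, format="turtle")
print(len(g))  # 4 triples describing the tiny domain model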


It is good to realize that partial solutions will work in the vision of the semantic web and its open world assumption. You don’t have to model the complete world of a domain. It is better to start small and to build on shared knowledge and the reuse of other agreed ontologies.

Daniele and Ferreira Pires (2013) have proposed an approach of networked ontologies based on a core ontology containing the main domain aspects of business logistics. They propose to combine this top-down practice with pragmatic bottom-up practices. Loukakos and Setchi (2010) have researched the customs domain in trying to define a sample core ontology in OWL. They confirmed its complexity and stressed that ontologies should be developed as an iterative process with intermediate validations by the several parties involved.

Heath and Bizer (2011) list some best practices for defining vocabularies. The main points are to be open to the world and to reuse (link to) vocabularies and terms that have already been agreed and defined. The WCO data model could be used as a vocabulary in this sense (see below). The web of data advocates that we should think and act more globally and practically than in traditional closed-world island designs (‘from data islands to a global data space’).

Domain modelling for open systems in ontologies really differs from that for closed systems. In open systems we apply so-called monotonic logic. This logic does not allow domain conclusions to change when new information becomes available: the logic keeps the same ‘tone’. We will further address this aspect below and in paragraph 4.7.

The WCO data model

The domain of customs and cross border trade already makes use of many standards. The most important of these are given by the WCO data model (WCO; 2015). It is described as ‘a maximum set of carefully combined and harmonized data requirements derived from cross border regulations’ and as a ‘toolbox’. It is consistent with and makes use of other international standards such as UNTDED and UN/CEFACT (un-code; 2015).

The downside of this model is that it is not a real model but a set of components and best, harmonized practices. It is not an open standard and it is supported by a proprietary tool (GEFEG). For this reason not every EU member state will support it.

Ontologies offer the opportunity to bring the WCO achievements and the UN/CEFACT standards to an open and shared level. The many standard code lists they contain could be published in a core ontology and vocabularies, by which they become public and free to use. So it is not a question of which to use, the WCO data model or ontologies: they may form a good combination, by which the data model is recast in a broader, open and standard context.


Semantics and reasoning

The process of describing an open, web of data domain can proceed incrementally, sequentially asserting new statements or conditions, following monotonic logic. The ontology consists of sets of statements (axioms) that describe characteristics that must be satisfied by the ontology designer’s idea of ‘reasonable’ states of the world (Patel-Schneider and Horrocks 2007).

Formal or logical semantics is the study of the semantics of formal and natural languages. The main modern approach, after Aristotle, is model-theoretic semantics (wiki-sem; 2015), which is based on first order predicate logic. Ontologies making use of a formal ontology language (like OWL) represent the form of vocabularies that may contain the most relevant semantics or meanings of concepts. Such semantics, based on defined logic, may lead to new insights into data and even to new data. We call this inference. For example, suppose we know the following facts:

John is the father of Marc; Mary is the mother of Marc; Julia is the full sister of Marc. Then we may infer the knowledge (and data, in the form of new triples) that Julia is the child of John and Julia is the child of Mary. In this way the real data, sometimes called the asserted data, conforming to an ontology may be extended by inference, creating the inferred data. Inferring data or knowledge is also called semantic reasoning. It is performed by software able to infer logical consequences from a set of asserted facts or axioms.
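This inference can be written as an executable rule. The sketch below expresses it as a SPARQL CONSTRUCT query over the asserted triples, using the rdflib Python library; the family vocabulary is our own illustration, not part of any standard ontology.

from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/family#> .
ex:John  ex:isFatherOf     ex:Marc .
ex:Mary  ex:isMotherOf     ex:Marc .
ex:Julia ex:isFullSisterOf ex:Marc .
""", format="turtle")

rule = """
PREFIX ex: <http://example.org/family#>
CONSTRUCT { ?sibling ex:isChildOf ?parent }
WHERE {
  { ?parent ex:isFatherOf ?child } UNION { ?parent ex:isMotherOf ?child }
  ?sibling ex:isFullSisterOf ?child .
}
"""

inferred = g.query(rule).graph  # the newly derived (inferred) triples
for s, p, o in inferred:
    print(s, p, o)  # Julia isChildOf John, Julia isChildOf Mary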

The web of data adheres to the OWA (Open World Assumption), and reasoning on an open world domain implies the usage of monotonic logic by nature. This means that once assertions have been defined, they may never be retracted or reduced by stating new axioms. Adding new formulas will never lead to a reduction of consequences. In other words, the discovery of new knowledge cannot change what is already known and is always in line with existing knowledge. In practice, available knowledge will often be incomplete. However, to model common sense reasoning it is necessary to be able to jump to plausible conclusions from the given knowledge. Non-monotonic reasoning deals with the problem of deriving such plausible conclusions.

The CWA (Closed World Assumption) relates to the logic of data in databases as used in traditional closed IT systems. Only positive information is represented explicitly; if a positive fact is not present in the database, its negation is assumed to hold. Only provable facts are held true: the default rule is that a fact that cannot be proven is treated as false, known as negation as failure (NAF). Non-monotonic logic is useful for representing defaults; a default is a rule that can be used unless it is overridden by an exception. The world is an uncertain place: data used to make decisions may be incomplete, inconsistent and subject to change. Under the CWA uncertainties are dealt with strictly, by relying on the content of the database and applying NAF inference. Under the OWA there also exist methods to deal with uncertainties, like statistical and fuzzy logic methods (artint; 2015).
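The following sketch contrasts the two readings on the same data. The SPARQL FILTER NOT EXISTS construct behaves like negation as failure: absence in the store counts as a ‘no’, whereas under the OWA the same absence only means ‘unknown’. The data and vocabulary are illustrative.

from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/customs#> .
ex:Carrier_Z a ex:Carrier .
""", format="turtle")

naf_query = """
PREFIX ex: <http://example.org/customs#>
ASK {
  ex:Carrier_Z a ex:Carrier .
  FILTER NOT EXISTS { ex:Carrier_Z ex:holds ex:AEO_Certificate }
}
"""

print(g.query(naf_query).askAnswer)  # True
# Closed world reading: Carrier_Z has no AEO certificate.
# Open world reading: no certificate is known in this store;
# another source may still assert one, so the answer is 'unknown'.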

In our research design we focus on discovering trusted trade lanes, and we suggest doing this with linked data and (statistical) data analytics. We propose to let customs define reasoning rules by which additional information is inferred, so that they gain insight into the levels of trust to apply to shipments via specific trade lanes.

4.5. SPARQL

SPARQL is the third foundation of the web of data. It stands for SPARQL Protocol And RDF Query Language. RDF is a well-structured data model that is very suitable for tooling (Heath and Bizer 2011); SPARQL is the standard query language for RDF used by that tooling. SPARQL has let people access a wide variety of public data and has provided easier integration of data silos within many enterprises (DuCharme 2013). In essence SPARQL seems rather simple because of the RDF structure in triples, but, like all programming languages, it is in fact much more complicated. To get an impression of its essence and power we will just touch upon it and give a small example relating to our case study. In our research design we assume that linked data sources will be queried using SPARQL. The syntax much resembles the Turtle format that we referenced earlier. We will provide some examples of both.

The SPARQL standard includes a protocol for communicating queries and results, so that a query engine can act as a web service. Web services that conform to the SPARQL protocol are called SPARQL endpoints. They form interfaces towards a knowledge base (sem; 2015). It is even possible to provide SPARQL access to databases that are not triple-stores, effectively translating SPARQL queries into the query language of the underlying store. The W3C is standardizing the translation from SPARQL to SQL for relational stores (Allemang and Hendler 2011). So it really appears that SPARQL is the query language of the future.
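As a small illustration of querying such an endpoint over the web, the sketch below uses the SPARQLWrapper Python library against the public DBpedia endpoint; availability of that endpoint is assumed, and any other public endpoint would serve equally well.

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://dbpedia.org/sparql")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    SELECT ?label WHERE {
      <http://dbpedia.org/resource/Rotterdam> rdfs:label ?label .
      FILTER (lang(?label) = "en")
    }
""")

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["label"]["value"])  # the English label of the resource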

Example

For creating an example in SPARQL we have set up a small ontology. It is presented in Annex 1 in Turtle layout and defines supply chain stakeholders and their identifications as declared to Dutch customs. It contains the class definitions following the OWL format. The ontology reuses a W3C OWL ontology for this, together with some other basic ontologies for definitions, to show the reuse of vocabularies. The Turtle file in Annex 1 also contains some real data examples, and some explanations have been added (in italics).

After loading these data into a triple-store, we use SPARQL to query them. The next basic query statement simply lists all triples.

SELECT ?s ?p ?o WHERE { ?s ?p ?o }

It states: select all subjects, predicates and objects. The part between braces forms the ‘where’ clause, which delimits the result. If we supply a specific value, then only triples with this value are requested; for example, all triples which have the stakeholder FloraHolland as the subject (with c: the DutchCustoms prefix introduced below):

SELECT ?p ?o WHERE { c:FloraHolland ?p ?o }

The result of the first statement, selecting all triples of our example, is listed below.

s | p | o
has_reported_id | rdfs:range | xs:integer
has_reported_id | rdfs:range | Reported_ID
has_reported_id | rdfs:domain | Stakeholder_via_Customs
has_reported_id | rdf:type | owl:DatatypeProperty
Stakeholder_via_Customs | rdfs:subClassOf | owl:Thing
Stakeholder_via_Customs | rdf:type | owl:Class
Reported_ID | rdfs:subClassOf | owl:Thing
Reported_ID | rdf:type | owl:Class
Grower_A | has_reported_id | 333
Grower_A | rdf:type | Stakeholder_via_Customs
FF_A | has_reported_id | 222
FF_A | rdf:type | Stakeholder_via_Customs
FloraHolland | has_reported_id | 111
FloraHolland | rdf:type | Stakeholder_via_Customs
333 | rdf:type | Reported_ID
222 | rdf:type | Reported_ID
111 | rdf:type | Reported_ID
DutchCustoms | owl:versionInfo | "Created with TopBraid Composer"
DutchCustoms | rdf:type | owl:Ontology

A main graph view of these data, with a selection of the triples, is shown in the figure below.

Figure 14 Example Graph ‘has_reported_id’

The picture below presents a similar registration, but now we take as an example that these business registrations are held by an external source like Dun and Bradstreet (d&b; 2015). This source holds various business information which could be useful for customs to check. Let us assume that Dun and Bradstreet publishes its data as linked data, or that it offers its data via a SPARQL endpoint, possibly stored in another format than RDF.

Customs could then fetch extra information concerning stakeholders. Now suppose that customs wants to check the registration of Grower_A with Dun and Bradstreet. A simple statement (‘ASK’) referring to the above indicated triple will return true (the registration is found) or false (not found).

PREFIX c: <http://DutchCustoms#>
PREFIX d: <http://DunBradstreet#>
ASK { d:Grower_A d:is_registered_by ?dun_id }

In the above situation the result would be ‘true’. Note that for ease of notation SPARQL allows the use of prefixes, just like the Turtle of Annex 1: @prefix owl: <http://www.w3.org/2002/07/owl#>

These prefixes refer to so-called namespaces. These are defined by and belong to a single authority and may also serve as a URL. Namespaces allow different agents to use the same word in different ways, as described in ontologies (Allemang and Hendler 2011). In the above example Dun and Bradstreet would have described its data in an ontology under the namespace ‘http://DunBradstreet#’ and published it under this URL.
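For completeness, the sketch below is a runnable rendering of the ASK example, using the rdflib Python library on a locally loaded triple. The DunBradstreet namespace and the is_registered_by property follow the illustrative names used in the text, and the registration identifier is hypothetical; none of these are identifiers of the real Dun and Bradstreet service.

from rdflib import Graph

g = Graph()
g.parse(data="""
@prefix d: <http://DunBradstreet#> .
d:Grower_A d:is_registered_by d:dun_12345 .
""", format="turtle")

ask = """
PREFIX d: <http://DunBradstreet#>
ASK { d:Grower_A d:is_registered_by ?dun_id }
"""

print(g.query(ask).askAnswer)  # True: a registration is found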

The example is rather simplistic and only serves to show the basics of SPARQL. We simply used the names of businesses as IDs, but in practice these are numbers like the EORI number, and it will be a challenge to find or create unique references. This is relatively easy in a closed world working with the Unique Name Assumption (see par. 4.7), but more difficult in an open environment.


As an example of a live linked data application we refer to UniProt, which offers selections of all kinds of information on proteins (uniprot; 2015).

4.6. Initiatives in practice

The linked data concept has not been addressed intensively in CASSANDRA and CORE so far, but interestingly it did get attention in other EU initiatives. In our opinion this is because the starting point of those projects is (openly) available data to link to. For improving supply chains the data is not there yet, and the options to improve interoperability should be the starting point. We will not elaborate on them but merely reference them here.

- LOD2 (lod2; 2015)

- LDBC (Linked Data Benchmark Council, ldbc; 2015)

- European Ontology Network (euon; 2015)

4.7. Adapting to the Open World

The web of data is open by nature and uses the Open World Assumption (OWA), whereas traditional IT is closed by nature (Closed World Assumption, CWA). We can distinguish between open world and closed world problems, although many domains have aspects of both. We could consider for example as closed types: 'which trains run today between x and y' and 'which stakeholder is the lodger of the ENS'. A clear, definite answer is obvious and obtainable. A more open type of question would be: 'can we consider shipment x of goods to represent a trusted trade lane?', because customs will not have all data and knowledge stored in its database to decide on this. Closed systems are very suitable for constraining and validating enumerated data; irregularity and incompleteness are toxic to the closed world relational model design. Any fact not known is regarded to be false (NAF: Negation As Failure).

As an example of a closed world approach in customs processes, consider a rule that the declared Customs Commodity Code (the digit code for the goods) immediately invokes a safety and security inspection if it is part of an enumerated list. Such a true/false condition could be defined more loosely in an axiom that also takes more open aspects into account, such as whether the specific supply chain is realized by a trusted trade lane, or whether the purpose of the goods flow is known through additional information.

An open world approach is required for processes which are underspecified, which should be easily reused and extended, which are involved in knowledge building and which are incomplete by default. Any fact not known is regarded as simply not known, but may easily be added into the system (manch; 2015).
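To illustrate the difference on the inspection rule above: SPARQL applies negation as failure over the local graph only, so a closed world style rule can be approximated and then loosened. The following sketch uses our own illustrative property and class names (c:has_commodity_code, c:is_listed_on, c:RiskList, c:reveals, c:is_trusted), not an existing customs vocabulary. It selects a shipment when its declared code is on the enumerated risk list, unless the store already holds evidence that the trade lane is trusted.

prefix c: <http://DutchCustoms#>
SELECT ?shipment
WHERE {
  ?shipment c:has_commodity_code ?code .
  ?code c:is_listed_on c:RiskList .
  # Negation as failure: this only inspects the data currently in the store.
  FILTER NOT EXISTS {
    ?shipment c:reveals ?lane .
    ?lane c:is_trusted true .
  }
}

Under the OWA, the absence of the trusted-lane triples does not prove that the lane is untrusted; the FILTER NOT EXISTS clause merely closes the world locally, over whatever data customs has gathered at that moment.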

Bergman (bergman; 2015) has argued that knowledge management processes are best supported by open practices. He lists the OWA as just one contrast between closed practices ('the relational approach') and open practices ('the semantic web approach'), basing his view largely on findings of Patel-Schneider and Horrocks (2007). See the list of contrasts below.


Relational approach (closed practices) versus semantic web approach (open practices):

- Closed World Assumption (everything is prohibited until it is permitted) vs. Open World Assumption (everything is permitted until it is prohibited)
- Unique Name Assumption (a different name is a different entity) vs. duplicate labels allowed (identity assertions must be explicitly stated)
- Single schema (one world) vs. many world interpretations (data depends on (multiple) interpretations)
- Integrity constraints (to prevent 'incorrect' values) vs. logical axioms (restrictions through domains and ranges; everything can be true unless proven otherwise)
- Non-monotonic logic (the number of conclusions might shrink with the size of the knowledge base) vs. monotonic logic (a new piece of knowledge cannot reduce what is known)
- Fixed and brittle (changes require re-architecting the database) vs. reusable and extensible (designed to be reusable; network modelling and agile database design)
- Flat structure; strong typing (flat tables; foreign keys; strong data typing orientation) vs. graph structure; open typing (graphs supporting linkage and connectivity analysis; data types treated as classes)

Figure 15 Comparing the closed and open practices (bergman; 2015)

Bergman claims that the closed practices, using relational databases, prove to be inward bound, fixed and dedicated to tight structures. The open practices of the semantic web need flexibility in adding findings and adapting to new knowledge and change; designs are reusable and extensible. This supports the general advice concerning ontology design (paragraph 4.4) to start small and reuse other ontologies.

Bergman states in general that the CWA is a very poor choice when attempting to combine information from multiple sources, to deal with uncertainty or incompleteness in the world, or to integrate internal, proprietary information with external data. He notes that traditional closed systems in the fields of business intelligence, data integration and knowledge management show over-engineered schemas, too complicated architectures, massive specification efforts and high deployment costs. He advocates the open approach as an incremental, low-risk means to knowledge management.

Drummond and Shearer (manch; 2015) have also compared the open and closed world assumptions. They conclude that both approaches have their strong points and that a switch from closed to open should be considered carefully. Some problems are inherently closed world, for instance when dealing with a finite number of elements. They suggest the option of 'closure of the open world': defining limitations on the OWA and in that way bridging the two worlds. We will make use of this suggestion in our design proposal.

In general we are of the opinion that the risk assessment process under research here would be served by the open world aspects. Global supply chains are constantly evolving and re-arranging; capturing this world by strictly constraining rules and conditions alone is difficult. Such rules may have too rigid an effect in execution and will mostly lag behind actual developments. On the other hand, strict rules may of course also serve to apply uniform treatment.


4.8. Intermediate conclusion

Linked data by nature offers high potential for improving interoperability, by creating links to data with agreed semantics. Many initiatives are taking place to open up the web by providing data to link to. Most applications are involved in presenting data from governments and organizations for the general benefit. Linked data starts with initiatives to publish data and their semantics.

Open systems, the OWA and monotonic logic are required to apply linked data. These systems are most suitable for knowledge management processes. Traditional closed systems may be very well suited to closed world problems. We conclude that an open approach would suit the customs risk management under research here.

Incremental growth is facilitated, and not only by applying monotonic logic in domain ontology designs. Data definitions and the data themselves may expand easily, as opposed to traditional tables, which have locking boundaries. Semantics and the reuse of commonly accepted ontologies may be used for fast and broad adoption. In this way not only new data paths and links but also semantics may easily be added. Another plus is that it allows for an easy plug-in facility for code tables, in the form of vocabularies.

Still, businesses are not eager to use linked data yet (Lieke Verhelst). They seem reserved about sharing their business data on the web, perhaps with reason. Maybe it will just take time for awareness to grow before businesses realize the possible benefits, just as the web itself took some time for businesses to take advantage of.

In this research we will not focus on the aspects of (un)willingness to participate in data sharing, nor on authorization and security. In general we take the position of looking at the possibilities of using linked data in supply chains to improve the functioning of the whole. We assume that the issues of trust and privacy will resolve on some scale, in some way, at some time, when the advantages for interoperability become demonstrable.

Our research sub question to answer here is: what are the characteristics of linked data that could improve the mutual interoperability of businesses and cross border authorities like customs? We have looked at these promising characteristics. In fact we believe the web of data has the potential to shape the desired data pipeline, without the need for central IT systems or a central public-private governance model. Parties should install innovative software to make proper use of linked data (an RDF data store, SPARQL), and smaller parties may easily supply their data on the web. RDF offers a framework of interoperability by which all kinds of data in all kinds of forms may be linked. In practice, governance by brokering parties like port and business community systems would be a logical development. Before open linked data systems can be implemented, a shift towards open system approaches is needed. This may prove a challenge, but it can be done in small incremental steps.


5. Design proposal and case study

We will use a case study as a reference to show how we could make use of linked data. This study is also used in CORE to demonstrate a customs dashboard making use of Optional Filing. The design proposal hereafter should be regarded as a general direction for customs risk management in maritime global supply chains. For this design we will use the conceptual contribution of the 'GOWA' (Guided Open World Assumption).

5.1 The Guided Open World Assumption

In the previous chapter we discussed the OWA, which is the obvious assumption to apply in knowledge systems concerned with reasoning and decision making on the open data of the web. The applied logic must take note of facts that are not known to the decision making process at a point in time but may affect it. When using linked data we must be prepared for additional data to be provided that may influence the statements we make concerning a domain situation. At the same time we must make and underpin plausible conclusions at a certain moment in time. We propose the Guided OWA (GOWA) as a means to ease the adoption of the OWA by the traditionally closed world processes of customs risk management.

For the finding of appropriate data on the web to function well, applications and humans should have some clues where to look for the data. When dealing with a small number of links and data sets this poses no practical issue. But dealing with the large number of stakeholders concerned with thousands of global transactions per day, customs authorities could use some guidance and control as to where to look for the data of specific shipments. The ultimate message paradigm shift, in which customs no longer receives declaration messages at all (chapter 3), would imply fully adopting the OWA. Parties would then supply customs with SPARQL-endpoints connecting to supply chain data, and customs would install applications with data-crawling mechanisms to continuously search for specific information. It is far more practical to start by supplying customs with references to new linked data, in the form of handles, enabling a directed data pull.

Traditional IT-systems and reasoning have strongly adapted to the CWA, from the rise of IT in the eighties of the last century onward (chapter 4). We may now take advantage of the data on the web and turn to the OWA, but only for reasoning in an open-ended domain. Many domains are well handled under the CWA: that is, when the facts are enumerated and no proof of other facts can exist. Customs for instance requires the value of imported goods to be declared, in order to levy fiscal duties. The declared information is registered in the database, and subsequently all further actions and decisions concerning tax duties and fines are taken purely in relation to the available data. This is dealt with very efficiently by applying the CWA. But it is another question to determine the risks for safety and security on the declared data only. Additional information on traders and the supply chain is most welcome here, just as the customs officer must have an open mind in sensing potential risks.

A complete shift from CWA to OWA cannot be performed drastically; it seems obvious to attempt it in a stepwise manner. We support the ideas in (horrocks; 2015), a blog by Ian Horrocks and Pat Hayes, in which they elaborate on the question 'why must the web be monotonic'. We already discussed monotonic logic and observed that non-monotonic logic is used for defining plausible conclusions as defaults, in lack of information proving the contrary (par. 4.4). In the processes of customs risk management we should look for ways to combine monotonic web reasoning with the closed world reasoning of customs today. Or, as Pat Hayes stated in the blog: 'we need ways for them to co-exist smoothly'. A method to do this is to explicitly state the (non-monotonic) restrictions within the monotonic logic, for instance by stating that the data found on the web, relating to a specific statement at a URL or a namespace, were exhaustive according to specific search queries. This means that non-monotonic aspects are applied in open world reasoning by adding the restriction that certain data were found under certain conditions. Drummond and Shearer (manch; 2015) also pointed towards such an approach and called it 'closure of the open world'.

In this view, the measure of exhaustiveness of information (gathering), relative to some expectation, can be used in reasoning, for instance in determining the trustworthiness of a trader. If it is relatively low, in comparison with other sources, then the reliability of the determined trustworthiness will also be lower.

The above two aspects may be used to 'guide' the adoption of the OWA: providing IDs and seeds to find related data, and realizing a 'smooth' co-existence of CWA and OWA. In our research setting concerned with the ENS, we may use the closed world declaration process of the ENS to provide IDs as handles. We could even adapt the declaration process to provide extra information to use in searching for linked data, such as URLs pointing to Business Community Systems. In this sense we may consider the filing of the ENS as a kind of publish/subscribe event by which customs receives the location of, and access to, extra data (par. 3.5).

The goal of the GOWA is to ease the adoption of open monotonic reasoning (OWA) by closed systems, by explicitly stating non-monotonic aspects (CWA). Closed systems require complete and exhaustive data, whereas open systems are provided with mechanisms and techniques to handle incomplete data and make it easy to add new data and knowledge. We define the Guided OWA as follows.

The Guided Open World Assumption (GOWA) is the principle of allowing potentially unknown sources of data (OWA) into a traditionally closed world environment, (1) by providing specific identifiers that ease the search process for data on the web to validate statements, and (2) by explicitly stating the scope of the data thus captured, making it possible to locally 'close' the information and assume exhaustiveness.
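As a minimal sketch of aspect (2), the scope of a data-gathering step could itself be recorded as triples, so that reasoning may treat the result as exhaustive relative to exactly that scope and no further. The vocabulary used here (g:SearchClosure, g:used_seed and so on) is purely hypothetical.

@prefix g: <http://DutchCustoms/gowa#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# A closure record: which shipment was concerned, which seed and endpoint
# were used, when the search was performed, and the claim of exhaustiveness.
g:closure_1 a g:SearchClosure ;
    g:concerns_shipment g:MRN_1234 ;
    g:used_seed "MRN_1234" ;
    g:used_endpoint <http://broker.example/sparql> ;
    g:performed_at "2015-02-01T10:00:00Z"^^xsd:dateTime ;
    g:claims_exhaustive true .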

By applying the GOWA we may gradually adapt traditional systems using negation as failure (NAF) towards using assertions with a level of reliability that depends on the exhaustiveness of the available data. This research is concerned with customs risk management processes on safety and security. The GOWA may give direction to taking an open world perspective in these processes and help automate and improve the function of the customs 'trained eyes'. Note that the decision making process in customs risk management is confidential; it need not be justified and is kept secret from the outside world. This aspect in fact makes the process an ideal candidate for a modern open approach.

We pose the GOWA as a principled solution for addressing the actual ENS problems. It bridges the gap between a completely open system under the OWA, which is still unrealistic, and the traditional closed world methods that are used today and have proven insufficient for risk management. Customs will have to apply the OWA in using data available via the web, and the application of linked data will leverage this urge. Figure 16 below visualizes the GOWA.


Figure 16 GOWA for risk management, combining OWA with ENS

The closed world aspects of declaring the ENS by the sea-carrier are set up front. Customs responds by issuing an MRN (Movement Reference Number) as the identifier of the shipment in further B2G communications. We also sketched the possibility for the carrier's IT-system to connect to a business data pipeline, to show that such a connection using interfaces would link closed systems. Customs will use the MRN and extra data from the ENS (seeds, filers of data, URIs) to efficiently retrieve linked data. The GOWA allows adapting to monotonic reasoning by explicitly stating the closed world restrictions applied to it. Such restrictions may be the listing of the URIs and seeds that were used, together with the specific queries. We will apply and further explain the GOWA in the following design.

5.2 Design proposal for discovering Trusted Trade Lanes

Customs risk management processes in the Netherlands consist of two phases. The main automated process is fed by so-called risk profiles. These represent the factors which customs has defined as important in weighing the expected risks against the data of shipments as received in the ENS declarations. These weighing factors consist for instance of specific countries in combination with imported goods which represent actual risks of narcotics smuggling. This process determines the containers which should be inspected (marked 'red') and which are candidates for inspection ('orange'). Marked 'orange' means that customs officers ('the trained eye') must decide whether to inspect or not, just a few hours before the shipment enters the first port of entry. For this risk assessment the customs officers use their knowledge, their information sources and their common sense.

Our design proposal focuses on better informing these customs officers, by means of added information on the shipments and containers to decide on, provided on a customs dashboard. The added information is acquired by using linked data in optional filing.

Customs dashboard requirements

In interviews, customs officers (Wim de Viet) were rather restrictive in defining requirements for risk assessments, the reason being that these should be kept secret from the outside world. Regarding a customs dashboard, they state that they want as much information as possible on shipments, 'hopefully reliable'. This last remark really reflects an open world view: it is not a matter of yes or no, but of rating the available information. The information should lead to better founded decisions than relying on the ENS data alone. In general we may characterize the functional requirement as showing extra data, and in more detail than declared on the ENS. An overview of basic data elements is provided in Annex 5, but even broader data may be considered, such as the financial behaviour of parties and certifications. An important general requirement is that extra data must be arguably connected to a shipment under inspection. These data may connect to the products, the transport units, the routing of the shipments or the involved stakeholders. For this, the before-mentioned seeds like container number and MRN are necessary to apply in selection filters. We will now look at some important (non-functional) requirements in more detail.

Availability and timeliness: data should be available to customs 72 hours before the expected time of arrival at the first port of entry into the EU. This is a legal time limit for the second risk assessment, which customs wants to support via the dashboard as well.

Performance: extra information should be available within a matter of seconds when requested, to support the human decision making process in a user-friendly way.

Adaptability: the dashboard functionality must be able to adapt swiftly to new types of information. It must be supported by an easy-to-expand data model, because the exact information provisions cannot be defined beforehand.

Adoption rate: customs is helped by a high adoption rate, that is, when many businesses provide data. To achieve this, the procedures and required technology should be relatively easy and low cost.

Jurisdiction: the dashboard functionalities should be in line with EU and Member State jurisdictions. Dutch customs is reluctant to store data which it has not obtained through declarations. Dutch law does provide for storing data on shipments that customs has used to reach specific decisions in cases of extraordinary actions, but jurisdiction on storing data is still an issue within Dutch customs. This research did not investigate the implications for other Member States.

We will re-address these requirements from the case-study point of view hereafter.


Architecture Linked data for optional filing

The customs dashboard aims at presenting extra data and information next to the data acquired via the ENS declaration. Figure 17 below presents an architecture for customs dashboard connectivity to linked data for the purpose of optional filing, in the form of an overview of layers. It depicts the 'ideal' situation in which all extra optional filings are provided as linked data in triple stores that offer SPARQL-endpoints for ease of access. Stakeholders in the supply chain are shown in the bottom layer. Stakeholder 'B' files the ENS declaration as the backbone of the communication process, via traditional electronic messaging. The filing is performed from its IT-system and the declared data are stored in the customs database. The customs information dashboard on the upper level uses these data. Stakeholders 'A' and 'C' may provide extra data from their own IT-systems. If they do so, they offer references to these data, which are held in a triple store, as shown in the third (orange) layer. Access to these data by customs authorities is pre-formatted by SPARQL-endpoints which offer pre-defined queries.

For linking to data, the 'on-the-fly dereferencing' pattern could be too time consuming if many data have to be traced via links (Hofman 2011-2). The performance of the dashboard functionality will be strongly improved by making use of references (seeds; IDs) that connect to the ENS data, following the principle of the GOWA. Note that this architecture implies the sharing of an ontology by customs and businesses, and internal communication between stakeholders. We will address these later with figure 19.

Customs data search processes will be sped up by knowledge of the offered SPARQL-endpoints, which could be part of the GOWA. Web services in general may be published in a UDDI (Universal Description, Discovery and Integration) registry, which serves as a public registry of available services. It would certainly enhance the customs process if stakeholders published their services in a registry for border agencies. Such a registry may be set up by authorities, but also by BCSs and PCSs. The community systems, or customs data brokers, could serve as facilitators for data storage and data linking.
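Under SPARQL 1.1 federation, such a directed data pull could look roughly as follows; the broker endpoint URL and the property names are assumptions for the sake of the example. The MRN obtained from the ENS serves as the seed that restricts the search.

prefix c: <http://DutchCustoms#>
SELECT ?stakeholder ?role
WHERE {
  # Query the broker's endpoint directly, seeded with the MRN from the ENS.
  SERVICE <http://broker.example/sparql> {
    ?filing c:refers_to_mrn "MRN_1234" ;
            c:names_stakeholder ?stakeholder .
    ?stakeholder c:has_role ?role .
  }
}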

The triple stores connecting to business data, in the next orange layer, in fact form a business data pipeline if they also interconnect between business systems. They would also form the basis of the Knowledge Graph sketched in CORE.


Figure 17 Architecture ultimate linked data usage

Customs will have installed its own triple stores. These could not only link to the data pipeline, which would also allow for piggy backing, but could link to other linked data sources as well. Ideally these would include the data of other member states, creating the required but currently lacking technical interoperability. Other external sources like the Dun and Bradstreet database could be linked if provided with SPARQL-endpoints. Customs could use these open data to define a level of trust for the trade lane. Clearly the data quality of these external sources is very important: in using these data for risk assessment purposes, customs should weigh their importance in relation to the expected quality and define a scaled quality ranking. We will further address defining a trusted trade lane in relation to our case study. The next figure presents a growth model towards the usage of linked data for optional filing.


Figure 18 Architecture growth model

It is based on the RDF application architecture (Allemang and Hendler 2011) as presented in figure 13. Its essence is that optional filing is supported in various ways: data may be presented in diverse forms, such as Turtle files, comma-separated files or Excel files. Customs could even implement a web platform for businesses to add data manually. This would create a low threshold for businesses to take part in optional filing. All these data would then be transformed to RDF format. This allows the data federation principle to be implemented by using RDF as the uniform overall standard for data communication, according to the RDF application architecture described in par. 4.3 (Allemang and Hendler 2011).

This design allows for temporary storage of (links to) linked data as an option, in order to prevent jurisdictional issues such as data privacy. Dutch customs is so far reluctant to build a history of 'external' data.


Important first steps in adopting this architecture will be that customs starts to adopt an open world approach, be it in a limited way, and that businesses start to learn to share data in an open world manner. If several member states did so, they would be able to improve their interoperability in risk management.

Interaction diagram G2B and B2G using linked data

The general architectures above sketched the data flows from business to customs. Figure 19 below presents a more specific interaction diagram, showing all B2G and G2B communications involved. Four main stakeholders are depicted in this process. The carrier is the party that connects to customs as the traditional declarant. Customs also connects to informing traders (the optional filers), directly or via brokers (the BCSs and PCSs). We depict the communication making use of a broker.

Figure 19 B2G and G2B interaction for optional filing with linked data and GOWA application

Regular communication starts with step 1: the filing of the ENS. But prior to this, a general agreement must be set up on what the optional filing will exactly consist of, using an ontology or vocabulary. Also, a contractual agreement must be set up between the informing trader and its broker party.

Step 0 should also precede the ENS filing and represents part of the GOWA. The carrier must be informed of who the informing traders for this trade lane may be and how they might provide the data. The carrier passes this information to customs for use as a seed; it may for instance be the URL of optional filings registered within a broker's IT-system. The carrier adds these data to the ENS declaration that it lodges with customs (1).

After receiving the ENS, customs replies to the carrier, providing the MRN (2). The carrier shares the MRN with the informing traders (3), who use the MRN as an ID in their optional filings and provide their broker with these data (4).


Now customs may perform directed searches for these data, making use of the provided seeds (5). If found, customs uses the data in presenting the customs dashboard (6). It might also store the data in its systems in order to build a history on trade lanes. Finally, customs may inform the carrier that the declared shipment should be presented to customs for inspection (7).

The GOWA most directly affects steps 5 and 6, because it directs and eases the search process. But the ontology agreement and steps 0 and 3 are also affected by it: it shapes and directs the communication process.

Design towards trusted trade lanes

The domain of customs in relation to supply chains is very complex. To gain insight into using linked data technologies, we have designed a simple ontology describing a basic domain model. Its purpose is twofold. First, it serves to register the stakeholders involved in a specific (maritime) shipment, together with this shipment: not only the ones noted in the declared ENS, but also stakeholders that have become known through optional filing. The second purpose is to create an indicator of whether customs may regard the trade lane that takes care of this shipment as a trusted trade lane (TTL).

The main triples of the ontology are presented as graphs in figure 20 below. The full definition is listed in Turtle format in Annex 2, which is rather self-explanatory. We added some options for adding semantics by defining specifics within the ontology as comments (in bold), next to the semantics expressed by the logic of the domain modelling itself.

For ease of presentation, the object enumerations (yellow boxes) in the figure have been placed loosely, but close to their connected predicates; the registration source is ENS or Optional Filing. The different roles of stakeholders are noted in the same manner.

Figure 20 Simple ontology domain model of stakeholder involvement in a trade lane

By way of explanation we list some important relations or triples. A stakeholder has a role; a role is one of (consignee, buyer, etc.); a stakeholder may be an optional filer. A stakeholder is involved in a shipment (identified by MRN); a stakeholder may embody a trusted trader. A shipment reveals a trade lane; a trade lane may consist of trusted traders; a trade lane may be trusted. A container is part of a shipment; a bill of lading sources a shipment.
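Rendered in Turtle, the core of this model could look roughly as follows. This is a free paraphrase of the definition in Annex 2; class and property names are indicative only, and the role enumeration is abbreviated.

@prefix of: <http://dutchcustoms/ontologies/optfiling#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

of:Stakeholder a owl:Class .
of:Shipment a owl:Class .
of:TradeLane a owl:Class .
of:Container a owl:Class .
of:TrustedTrader a owl:Class .

# Roles are enumerated; the full model also lists seller, consolidator, etc.
of:Role a owl:Class ;
    owl:oneOf ( of:Consignee of:Buyer of:Carrier ) .

of:has_role a owl:ObjectProperty ;
    rdfs:domain of:Stakeholder ; rdfs:range of:Role .
of:is_involved_in a owl:ObjectProperty ;
    rdfs:domain of:Stakeholder ; rdfs:range of:Shipment .
of:embodies a owl:ObjectProperty ;
    rdfs:domain of:Stakeholder ; rdfs:range of:TrustedTrader .
of:reveals a owl:ObjectProperty ;
    rdfs:domain of:Shipment ; rdfs:range of:TradeLane .
of:is_part_of a owl:ObjectProperty ;
    rdfs:domain of:Container ; rdfs:range of:Shipment .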


Note that this model only serves as a meta-model to describe the black box of stakeholders. It has not yet been commented on or validated by experts and serves merely as a rather primitive example. We would prefer to first define a general core ontology of the customs domain, as promoted also by Daniele and Ferreira Pires (2013). This 'optional filing' ontology could then reuse that core ontology and extend it for its specific additional purpose. But such research exceeds this paper. Such a core ontology would contain many more data elements, which could also be obtained as linked data, such as data on physical objects like containers and data on commercial transactions in the chain.

Now suppose, for example, that stakeholder FloraHolland has knowledge of the MRN that customs has provided to Maersk as the ENS-filer, following the GOWA. FloraHolland wishes to file extra data on the growers of the roses. The following data could be presented to customs as a URI-link, or gathered in an Excel sheet using the ontology definitions. The Excel sheet would also contain a reference to the ontology, like 'http://dutchcustoms/ontologies/optfiling'. The application architecture in figure 18 allows for providing data in various formats.

Optional filer   MRN        Container   B/L     Stakeholder    Role
FloraHolland     MRN_1234   Cont A      BL_11   Grower_A       Seller
FloraHolland     MRN_1234                       FloraHolland   Consolidator

These optionally filed data would be loaded into the customs triple store as presented in figure 21 below.
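After transformation to RDF, the two rows above could for instance correspond to triples like these, reusing the indicative vocabulary sketched earlier (of:sources and of:filed_by are assumed property names; the latter records the optional filer).

@prefix of: <http://dutchcustoms/ontologies/optfiling#> .

of:MRN_1234 a of:Shipment .
of:Cont_A of:is_part_of of:MRN_1234 .
of:BL_11 of:sources of:MRN_1234 .

of:Grower_A a of:Stakeholder ;
    of:has_role of:Seller ;
    of:is_involved_in of:MRN_1234 ;
    of:filed_by of:FloraHolland .

of:FloraHolland a of:Stakeholder ;
    of:has_role of:Consolidator ;
    of:is_involved_in of:MRN_1234 .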

This graph also contains the stakeholders as known via the ENS. One of these is the freight forwarder 'FF_A'. We assume that this business proves to be registered as a trusted trader. By automated reasoning, or inference of triples, the relation 'FF_A embodies the trusted trader Trusted_FF_A' could be established automatically. Also by reasoning, the inference may be made that the shipment identified by MRN_1234 is part of the trade lane 'roses from Kenya to Holland'. This trade lane may have been defined by customs in advance as the result of an agreement with the trade lane representative (the 'meeting room' model). It could also be established by reasoning, by connecting this shipment to other shipments of roses from Kenya and designating it as a regular trade lane.

The indicator that this is a trusted trade lane could subsequently be derived by reasoning rules. If, for example, the number of trusted traders involved in this shipment exceeds a listed average, the indicator should be set to true. Reasoning may also be based on a ranking of the stakeholders according to customs' interpretation, for instance based on properties like: not known, open, registered, verified, certified. Such a ranking may itself be acquired by reasoning. In this way a 'degree of trust' in the direction of a TTL is determined.
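A reasoning rule of the first kind could be approximated with a SPARQL aggregate; the threshold of three trusted traders is an arbitrary illustration, and the property names follow the indicative ontology above.

prefix of: <http://dutchcustoms/ontologies/optfiling#>
SELECT ?lane (COUNT(DISTINCT ?trusted) AS ?n)
WHERE {
  ?shipment of:reveals ?lane .
  ?stakeholder of:is_involved_in ?shipment ;
               of:embodies ?trusted .   # stakeholder embodies a trusted trader
}
GROUP BY ?lane
HAVING (COUNT(DISTINCT ?trusted) >= 3)

Lanes returned by such a query could then be asserted as trusted, or, more cautiously, be given a degree of trust proportional to the count.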


Figure 21 Structure of actual data based on ENS declaration and optional filing

This model serves as an example of how linked data technologies may be used for data and knowledge management within customs authorities. They support optional filing, linking federated data from SPARQL-endpoints or delivered as Excel files, to implement optional filing in a GOWA environment. The model focuses on stakeholders involved in specific shipments that may represent a regular trade lane. The stakeholders become known through customs declarations and through additional optional filing by stakeholders involved. The place of the stakeholders in the supply chain must be determined from the role they are described by.

5.3 Case study fresh roses

General

Cut flowers from other continents are traditionally imported into the Netherlands by air; their perishability requires swift transportation. For some years now, import by deep-sea in reefer containers has been starting up on a small scale. This concerns the import of roses from Kenya to the Netherlands, for which research is being done on their perishability. This trade lane is also investigated in the CORE project, in researching interoperability and coordinated border management. The trade lane starts with the growing of roses in Kenya by professional growers. The roses are gathered and packed at Nairobi. They are transported by conditioned lorry to the port of Mombasa, where they are checked by authorities and loaded into containers. These are shipped to the port of Salalah in Oman, where they are reloaded onto another ship. The shipment proceeds to the port of Algeciras in Spain, the first port of entry into the EU, and from there continues to the port of Antwerp in Belgium. There the container is loaded onto a lorry and transported to the flower auction of Aalsmeer in Holland, where the roses are checked by authorities, unpacked and subsequently directed to the importer.

Figure 22 Container shipment with roses to Antwerp and Holland on its way

Below we list the stakeholders involved in this supply chain. Some take different roles in the processes. The last four columns indicate which stakeholders are involved in the border crossings and are related to the ENS declaration. For reasons of privacy some names have been anonymized. There are many stakeholders involved; this overview lists the main parties only. Note that the ENS declaration names only four of them: the shipper (or consignor, responsible for the shipping), the carrier, the party to notify at arrival in Antwerp, and the importer.

Stakeholder role         Other roles                       Business          Export Kenya  ENS  Import Belg.  Import NL
Seller 1                 Grower of roses                   Grower_A
Seller 2                                                   Grower_B (more)
Agent consolidator       Freight forwarder; consignor;     FH services       X             X
                         shipper; exporter
Agent mar. transport 1   Carrier                           Mar_Agent_A
Agent mar. transport 2                                     Mar_Agent_B
Road-carrier Kenya                                         Road-carrier_X
Sea-carrier                                                Maersk                          X
Freight forwarder        Notify party                      FF_A                            X    X
Road-carrier                                               Road_Carrier_A                                     X
Agent Customs                                              Cust_Agent_A                         X
Addressee                Sub-consignee                     Flora Rijnsburg                      X             X
Consignee                Importer                          FloraHolland                    X                  X
Buyer                    Importer                          Not named
Importer                                                   Not named
Authorities Kenya                                                            X
Authorities Spain        Port of EU entry                                                  X
Authorities Belgium                                                                             X
Authorities Netherlands                                                                                       X

Figure 23 Stakeholder overview trade lane of roses

It is interesting to see that other declarations, for import into Holland and Belgium, declare other parties. One reason for this is that parties involved with the ENS are actively involved only up to the first port of entry, whereas customs is really interested in the supply chain as a whole. Note that this trade lane is very much coordinated by the Dutch flower auction FloraHolland. They have installed FH services in Kenya, which acts as consolidator of the packed roses and as declarant. They are considering continuously informing customs on the involved stakeholders by means of optional filing.

Ontology for the supply chain

The investigation of this trade lane has shown the diversity not only of stakeholders but also of the forms and data elements used in general. Border control services demand standard declarations and forms. The phyto-sanitary controls use different forms than the customs authorities. The controls also comprise checking both types of forms on the noted total quantities and weights. These kinds of controls seem to check the separate administrations more than the real content. On the other hand, the plurality of forms and data elements makes even simple checks difficult. It proved quite a puzzle to map and match the data elements of all the different forms.

The figure below shows some investigation results concerning the naming of the transported goods, roses. It shows that the trade documents in the supply chain use many different names.


Country   Party / document    Name used for the goods
Kenya     Seller              e.g. 'Athena'
Kenya     Agent               'mixed roses'
Kenya     Health              'rose'
Kenya     Customs             'roses'
Kenya     Transport           'fresh flowers'
Kenya     Agents              'boxes of mixed varieties of fresh rose cut flowers'
Belgium   Health              'rozen'
Belgium   Customs             060311
Belgium   Agent               06031100
Holland   Customs             'fresh flowers' / 06031100
Holland   Buyer               e.g. '5088 Athena'

In this paper we have stressed the importance of ontologies or vocabularies for defining the aspects of the supply chain and customs domains in view of using linked data. The figure above makes clear that a unique definition of 'roses', and the application of it, would help the supply chain processes in general. The same applies to the hundreds of data elements involved in this case study chain of roses. These findings of differently described elements on various forms strongly urge the definition of a general set of semantics. This paper focuses on the visibility of stakeholders and linked data, but the creation of a main ontology of the supply chain and customs domain would be a great start for standardisation anyhow.

Degrees of trust in a trade lane

Figure 24 below once more shows the stakeholders in the trade lane of roses, based on figure 23 above. Its purpose is to show an evolution towards discovering (trusted) trade lanes by customs. Note that this is not reality but is meant to speculate on a direction to take. For this we have applied the trusted trade lane scenario of 'pattern recognition', as opposed to the formal certification 'meeting room' scenario (see chapter 2.4).

In period 1, at first, only four stakeholders are known, by declaration of the ENS. We consider, for example, some known (green; AEO certified) and some unknown traders (blue). Blue traders are in fact known by name only, but unknown to customs as a business. Note that the white stakeholders are truly unknown, because they are not visible on the ENS. It is clear that the data on the ENS alone cannot lead to awareness of a trusted trade lane.

Period 2 depicts the situation where customs is informed of the stakeholders, e.g. by optional filing. The next situation (period 3) shows that customs has obtained information to consider many stakeholders trusted. This information may come from customs IT-systems but also from outside, via linked data. The EU uses the AEO certification, but other nations use certifications as well, like the US (C-TPAT), Japan and Singapore. Another way to grant stakeholders a level of trust is when they are represented by a trusted agent or partner. The level of trust may grow as more data become available (period 4). But a trust level from the past may not be used as a constant indicator: supply chains change and stakeholders may change their processes. This is illustrated by period 5, where the party FF_A has been noted with a lower level of trust. This emphasizes that trusted trade lanes, too, should be monitored continuously. The main colours of blue, green and yellow, as used in the Dutch enforcement vision, may be extended with many extra shades of yellow. These should indicate a scale used to create adequate insight into the types of, and applied trust for, trade lanes.


Stakeholder role         Other roles      Business
Seller 1                 Grower of roses  Grower_A         50%
Seller 2                                  Grower_B (more)  20%
Agent consolidator       Exporter         FH services      90%
Agent mar. transport 1   Carrier          Mar_Agent_A
Agent mar. transport 2                    Mar_Agent_B
Road-carrier Kenya                        Road-carrier_X
Sea-carrier                               Maersk
Freight forwarder        Notify party     FF_A
Road-carrier EU                           Road_Carrier_A
Agent Customs                             Cust_Agent_A
Addressee                Sub-consignee    Flora Rijnsburg
Consignee                Importer         FloraHolland
Buyer                                     Buyer_A
Buyer                                     Buyer_B

Degree of trusted trade lane per period: period 1 (ENS) 0%; period 2 30%; period 3 60%; period 4 90%; period 5 70% (periods 2 to 5: optional filing + data).

Figure 24 Discovering the trusted trade lane

By opening up and sharing the business data involved, trade lane partners may strengthen their collective competences and ensure compliance. In order to determine a level of trust, some general trade lane properties will need to be established as indicators. For the roses trade lane, for example, customs must indicate the average number of stakeholders expected to be involved, in order to create a measure of completeness; a sketch of such a measure follows below.
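As a sketch, such a completeness measure could be computed by comparing the stakeholders actually known for a shipment with the expected number stored for the lane; of:expects_stakeholders is an assumed property holding that expectation.

prefix of: <http://dutchcustoms/ontologies/optfiling#>
SELECT ?lane ((COUNT(DISTINCT ?s) / ?expected) AS ?completeness)
WHERE {
  ?shipment of:reveals ?lane .
  ?lane of:expects_stakeholders ?expected .
  ?s of:is_involved_in ?shipment .
}
GROUP BY ?lane ?expected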

The gaining of insight into trade lane flavours will be facilitated by G2B discussions when following the 'meeting room' scenario. But the monitoring function and pattern recognition will be necessary anyhow. Trade lanes will have to prove their trustworthiness somehow, and the best basis for trust is evidence from behaviour, which is shown by the data of transactions. In the 'meeting room' scenario customs will try to gain trust through evidence of attitude or intentions and through evidence of identity and values. Such evidence is generally wanted beforehand, for certification purposes. But evidence from behaviour, in the form of automated monitoring, will be the most satisfying and reliable.

The next figure gives an overview of the process of determining the level of trust that may be assigned to a trade lane shipment.

Figure 25 Determine the degree of trust for trade lane

The actual shipment at hand is known from the ENS declaration, and extra data are added as linked data from the web. In determining the degree of trust that customs may apply to the trade lane involved, comparisons are made with prior comparable trades. For this, customs will need historic data, to which the actual data will also be added. This trade lane history may be enriched with result information that customs authorities have gathered on control events relating to the stakeholders in this trade lane. If prior controls relating to, for instance, import, export or transit activities of the same stakeholders never resulted in any discrepancy, then this will add to the level of trust.

Customs will also try to recognize, and build a registration of, the type of trade lane and the measure of awareness of it, using pattern recognition and further indicators yet to be discovered. Trade lanes may be categorized in general, but may also be assigned a unique identifier, like a name. A resulting indicator for the level of trust for the shipment will be presented on the customs dashboard.


Feasibility of the customs dashboard requirements

We will now come back to the requirements we noted in paragraph 5.2 and discuss their feasibility from the viewpoint of the case study.

Availability and timeliness. Much depends on the moment at which the optional filings are performed. Customs will limit the input time for its process of determining the degree of trust, which must be performed 72 hrs. before the expected time of arrival. This limit is determined by regulation and by alignment with other customs processes. We may regard this limitation as part of the GOWA, by which closed world arguments are explicitly stated in the open world approach.

Performance. We expect that the capturing of linked data may be executed by 'on-the-fly dereferencing' of URLs: the links to the data, and even the server addresses, will be available to customs (GOWA) to apply directly. The knowledge processes of reasoning and pattern recognition will be more time consuming, but these processes may largely be performed on a continuous basis and are less dependent on requests from the dashboard. Performance will be a concern, but only building experience and using new technologies will provide the answer to this requirement. In first applications it may be sufficient to show only the provided data, without a degree of trust.

Adaptability. Open world methods and network models are by nature easier to adapt and extend than closed relational systems (see paragraph 4.7).

Adoption rate. We may expect that businesses will be more inclined to provide extra information via linked data than by traditional messaging, as the latter is much more costly.

Jurisdiction. Our design depends heavily on data analytics, and in order to execute it customs must record the track records of trade lanes. Customs already has a history of declared data, but more data are needed to create the options for pattern recognition. If customs wants to adopt the sketched design, then the jurisdiction allowing customs to store the relevant open data should be (made) clear beforehand.

5.4 Intermediate conclusion

The sketched architectures show that linked data may certainly help to improve interoperability between customs and businesses, but also customs to customs. A simple design has shown the basics by which linked data may serve knowledge building on trade lanes, and particularly the concept of trusted trade lanes. Such a TTL may be defined and certified by agreement, or it may be recognized from its behaviour, through transactional data and data analytics.

TTL’s should constantly be monitored in order to show their trustworthiness by behaviour. So customs should anyhow make use of data analytics. For this they will need linked data which could be presented by optional filing. Optional filing may be supported on different levels, making use of data federation towards the RDF-standard.

We have proposed the GOWA as a controlled manner of adopting open world practices like monotonic reasoning. Optional filing can serve as the linking pin for the implementation of linked data within the traditionally closed and 'costly-to-connect-to' world of customs.

Our sub research question for this chapter is: what design could serve an implementation of Linked Data technology for EU customs risk assessments, and based on which requirements? The sketched designs are meant to serve as an orientation for implementing linked data, for which we made the GOWA contribution.


Customs is reluctant to share requirements for the confidential process of risk assessments, but we pinpointed some non-functional requirements for the customs dashboard, which serves as the means of presenting linked data in our case study. We found them in general feasible to implement using linked data. Storing externally acquired data will however be necessary, and this presents juridical issues.

6. Evaluation

In order to evaluate the outcomes of this paper, a questionnaire was sent to 15 people who are involved and experienced in the area of supply chains and customs, from a scientific, IT, business or customs perspective. Most of them were connected to the CORE project. This resulted in 11 completed questionnaires, based on the preliminary 0.8 version of this paper. The questionnaire is explorative and checks the degree to which the respondents agree with the conclusions of this paper, namely that it would be beneficial for customs risk management to make use of linked data technologies. The conclusions were presented in the form of 11 theses, as listed in Annex 6. So in total 11 times 11 answers were provided, as a choice from the following options: agree, could agree, no opinion, disagree, strongly disagree and don't know. An overview of the posed theses:

- customs risk management should use data available on the web next to declared data;
- supply chains should make use of ontologies;
- customs should make use of semantics and reasoning;
- linked data would serve piggy backing;
- the GOWA, together with optional filing, would be a good principle to start open world practices with.

They all relate to our main research question of how linked data could be used for more visibility of supply chains in general. The general response is shown in the figure below, covering all theses except 9 and 10; the latter required more inside knowledge and were sometimes not answered.

[Bar chart: response counts for 'agree', 'could agree', 'disagree' and 'don't know' on a scale of 0 to 60]

Figure 26 General result questionnaires on agreement to using linked data in customs risk management

The general feeling seems to be that people support the ideas behind linked data and ontologies, or at least do not reject them. In the questionnaire we did not address the obstacles to starting up with the new technologies and techniques. The disagreements concentrated on the theses about the choice for an ontology and the question whether customs would be the right party for linked data initiatives.

Comments

Some interesting remarks were added as comments. For instance, on the application of ontologies the following remarks were made: 'The supply chain parties seem not to feel the need for ontologies; if this need will not come and policy makers will not enforce the use of ontologies, then the possible interoperability will not be realized', and 'it will be difficult to agree on ontologies and this will cause delays'.

We have addressed the challenging problem of where and when to start with linked data developments. This issue relates to questionnaire thesis no. 8: 'customs is the right party to take linked-data initiatives'. We found that the respondents were less aligned in their answers to this one (see figure 27 below), giving an inconclusive result. This might be influenced by the vagueness of the description 'initiatives': respondents may have had different aspects in mind, such as who will pay for this, whether it implies setting up a new development team, or who will be responsible for defining the ontologies. But it may also underline the absence of concrete linked data initiatives so far. In general we have observed within CORE that Dutch customs is more or less taking a wait-and-see attitude, leaving the dashboard initiatives to the trading and supporting parties.

[Bar chart: response counts for 'agree', 'could agree', 'disagree' and 'don't know' on a scale of 0 to 3.5]

Figure 27 Thesis: customs should take initiatives towards linked data

Interestingly, some comments indicated that the application of the GOWA would only be suitable for customs risk management on safety and security matters. They suggested that risk assessments on regular declarations other than pre-loading summaries, for instance for transit, should still be based on the closed data of the declarations themselves. These assessments are not focussed on safety and security but mostly on fiscal aspects; they are guided by strict rules and regulations, and the measures against violations are also predefined in regulations. In agreement with these comments, we have afterwards stated more explicitly the relevance of the OWA for the knowledge based risk management processes. Other customs processes are well off using the traditional closed CWA reasoning.

In general we find the alignment with our conclusions remarkable, even more so considering the relatively few activities that, to my knowledge, are concerned with this subject within CORE, while the respondents are involved in CORE.


7. Conclusion

Maritime global supply chains consist of a complex mix of many stakeholders. They form a black box for customs authorities, who must watch over safety and security. In this paper we have explored if and how this customs risk management process may be supported by using linked data. We formulated our main research question as: 'how can linked data technology contribute to improving the visibility of supply chain stakeholders for customs risk assessments on goods entering the EU?'

We conclude that linked data makes it relatively easy for stakeholders to inform customs and offers low thresholds for communication. Data may be supplied in various formats (par. 5.2) and without installing specific messaging software. Linked data offers open standards to model the domain (ontology), to implement its data (RDF) and to query the linked data (SPARQL) (chapter 4; par. 5.2 and 5.3). In this way it provides a solid basis for transparent standardization in IT and communication, which may improve the desired interoperability. It offers scalability by allowing stakeholders to start on a small scale and expand incrementally, while already using the open standards (par. 4.4; 5.2). Furthermore, it may contribute to creating awareness for, and identification of, Trusted Trade Lanes and their levels of trust (par. 5.3), by offering techniques to infer data and knowledge. And lastly, by using linked data, customs authorities may adapt to and acquire knowledge on the application of open world practices, by which they are likely to improve their risk management, guided by the GOWA principle (par. 5.1). In short, linked data may provide easy access for stakeholders to develop and improve interoperability. It may improve the visibility of supply chains for customs risk management in a stepwise manner, providing the notions to realize the concept of trusted trade lanes.

After concluding that linked data may very well contribute to improving visibility and interoperability for customs risk management, we will now address the extra question posed in paragraph 2.5: what are the main differences, considering the desired (business to customs) interoperability and the trustworthiness and reliability of supplied data, between applying a closed world and an open world approach?

Interoperability

EU Customs is in the process of defining the new requirements for filing the ENS and still follows the traditional closed world approach, also in the proposed multiple filing option. Many new initiatives in the CORE project demonstrating the data pipeline also use pre-defined interaction via interfaces and B2G communication by means of standard messaging, like the recent UK development of David Hesketh addressing four Waypoints within the supply chain at which data is pushed by messaging (Hesketh, 2015). Experience shows that participation in such closed world systems is rather complex and costly (par. 2.2). If supply chain stakeholders fully adapt to these procedures and consistently provide all requested data in the predefined formats and semantics that EU Customs has defined, then a fine level of interoperability will of course be realized. But in practice it remains questionable whether stakeholders will (be able to) cooperate, and whether the multiple filing will be enforced. The improvements in interoperability are therefore still uncertain, mainly because of the costs involved in participating.

Our case study design (fig. 18) shows that the open world approach using linked data allows for flexible adoption of extra filings. Stakeholders need only make available the data that customs is to collect, be it on their own system or on that of a brokering agent. Data may be provided in various formats and volumes. The application of a shared ontology would help ensure that data arrive in the right format and with the right meaning. In the ideal situation, stakeholders provide the data in their own triple-stores. In such a set-up customs would be given access to complete business administrations, allowing for piggy backing. This would of course require new IT systems and extra costs, but it will only be realized when businesses find it profitable for their supply chain. We found that Optional filing could be realized using linked data technology, thus improving interoperability in a rather simple and cost-neutral way.
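As a sketch of what this could look like, assuming a stakeholder exposes its triple-store via a SPARQL endpoint (the endpoint URL below is hypothetical) and uses the optional-filing ontology of Annex 2, customs could issue a federated query:

PREFIX tt: <http://trustedtrader#>

# Customs pulls optional filings directly from a stakeholder's triple-store
# instead of waiting for messages to be pushed (endpoint URL is an assumption).
SELECT ?shipment ?stakeholder ?role
WHERE {
  SERVICE <http://stakeholder.example/sparql> {
    ?shipment a tt:Shipment_MRN ;
              tt:registration_source ?stakeholder .
    ?stakeholder a tt:Stakeholder ;
                 tt:has_role ?role ;
                 tt:is_optional_filer true .
  }
}

The design choice is that customs initiates the exchange: the query travels to the data, which stay current in the source system.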

Reliability
Traditional B2G messaging procedures implicitly guarantee that customs know the party that sends the data and may regard the received data as reliable. Identification, Authorization and Access control (IAA) are implicitly taken care of by following the rules defined in MIGs (Message Implementation Guides).

In using linked data, customs must reconsider to what extent they want to apply IAA and how to implement it. This is an important issue which needs further research. One option is to check the identity of parties that make use of the URI-link to linked data. In this way only known and trusted parties get access, and the awareness of such a check may give the linking party, like customs, the notion that some security measures have been taken. Customs may also assign some level of reliability to data because it contains specific locally provided information, such as an MRN. And if data is provided via known customs brokers or Business Community Systems, then this of course increases reliability. In general customs must adapt to new and varying methods to deal with levels of uncertainty. Just as with incomplete data, they will be confronted with different levels of reliability. This results in the levels of trustworthiness that we discussed in our design towards trusted trade lanes (par. 5.3). Directly linking to source systems may increase reliability, because the data are always current and never replicated.

Addressing the question on differences concerning interoperability and reliability under an open versus a closed world approach, we may conclude the following. The open approach of using linked data, focussing on risk management, offers more flexibility in stepwise improving interoperability. It allows for extra data filings without obliging parties to make high investments. On the other hand, when using linked data the reliability of data will in general decrease, and customs will have to find new ways of IAA and learn to deal with uncertainties. The GOWA concept provides guidelines for this. It is good to realize that reliability is not an exact matter: relying only on the ENS data as lodged by the carrier, which in themselves should be very reliable, may on the whole result in lower reliability and poorer quality than when extra data filings are used.

This research has been performed by investigating previous and still running supply chain research, by interviewing experts involved in corresponding research, by analysing the problems of customs risk assessments of EU entering goods and by taking part in EU research, which resulted in our case study. We have investigated the UCC drafts and compared Multiple filing to Optional filing. Semantic web and linked data have been investigated by desk research and by practical usage of the tools TopBraid Composer, AllegroGraph and Gruff.

Considerable research has been and still is being performed with the aim of creating more visibility and improved interoperability. Some state of the art concepts have evolved. The most prominent one is the data pipeline. It has gained major attention in projects, but a generic solution is still to be presented. Part of this concept is data pull.

The research on linked data in global supply chains is still limited. It has been suggested as a method to capture data and, only recently, as a solution to gain supply chain insights at a high (EU) level using graphs.

We have concluded that linked data is essentially equal to the semantic web and is built on three foundations: RDF (modelling framework), SPARQL (query language) and OWL (ontology domain modelling).

Linked data adheres to the OWA and applies an open approach, which is natural for systems connected to the web. This opposes the traditional closed systems approach (CWA) used by customs authorities. Some problems, processes and systems are closed by nature and need a closed system approach; others require openness. Knowledge management processes may be best supported by an open approach, using monotonic logic. We found that customs risk management for safety and security is best supported by an open approach, just as the human 'trained eye' of customs needs an open view.
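The difference shows in how query results must be read. In the minimal SPARQL sketch below (names from the Annex 2 ontology; data assumed for illustration), the query finds shipments for which no filing party is recorded; under the OWA this means 'information is missing', not 'no such party exists'.

PREFIX tt: <http://trustedtrader#>

# Shipments without a *known* filing party. Under open world reasoning this
# absence is a signal to gather more data, never a proven negative fact.
SELECT ?shipment
WHERE {
  ?shipment a tt:Shipment_MRN .
  FILTER NOT EXISTS { ?shipment tt:registration_source ?party . }
}

A closed world system would read the same empty pattern as 'this shipment has no filer'; the GOWA principle guards against exactly that leap.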

An open approach using linked data may grow incrementally. It supports ease of change, mapping of different source schemas and gradually implementing additions such as vocabularies of WCO Data Model standards. It will need the implementation of new tooling, which is based on open standards. But it may fulfil the promises of the data pipeline in interoperability and visibility, without implementing central IT features.

We presented an abstract design for applying linked data in the risk management process of customs, focussed on opening the stakeholders' black box and getting to know the parties in the trade lane. For this we contributed the Guided-OWA concept, to adopt the OWA into the closed world of ENS declarations.

The concept of Trusted Trade Lane was coined by Dutch customs and is still an open concept. We find it an intriguing concept which fits well into an open world approach. The customs risk management process should not only be able to recognize the trade lane of a shipment but also to interpret how trustworthy the trade lane is: a TTL may come in different shades of trustworthiness. Linked data plays an important role in this. Data analytics are to be used to detect trade lanes, but also to monitor their transactions and detect levels of trustworthiness.
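As a sketch of such analytics, assuming the Annex 2 ontology and reading registration_source as the string marker it enumerates ('by ENS' or 'by optional filing'), trade lanes could be ranked by how many of their shipments carry optional filings on top of the ENS:

PREFIX tt: <http://trustedtrader#>

# Per trade lane: total shipments and how many were enriched by optional filing.
# The ratio could feed into a trustworthiness level for the lane.
SELECT ?lane (COUNT(?shipment) AS ?shipments)
       (SUM(IF(?source = "by optional filing", 1, 0)) AS ?optionalFilings)
WHERE {
  ?shipment a tt:Shipment_MRN ;
            tt:reveals ?lane ;
            tt:registration_source ?source .
}
GROUP BY ?lane
ORDER BY DESC(?optionalFilings)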

A challenge will be to shift towards open thinking and to introduce open world practices and monotonic reasoning into the now closed systems. The method of Optional Filing in combination with GOWA may serve as a starting point for applying semantic web principles. We have proposed that customs authorities could start with adopting linked data, because they are the party that benefits most from data pipeline principles. Customs could even apply linked data and the OWA independently of whether the data sources themselves are linked. An open world assumption merely asserts that we never have all necessary information, and that lacking information does not in itself lead to any conclusions.

In general we conclude that linked data is very promising for improving visibility and interoperability, not only for customs but for all supply chain stakeholders. If they are willing to share data and opt for visibility of the chain as a whole, then linked data could fulfil the promises of the data pipeline (par. 5.2).

This research and its design are not detailed enough to conclude directly on a method or solution. But we find that they show a direction which has not been pointed out so clearly before. The Trusted Trade Lane is a very intriguing concept, and an open world approach using linked data may very well help its implementation.

We add some more general conclusions.
- Linked data offers the opportunity to agree on the semantics of the supply chain and customs domain by (re)using ontologies. It allows bridging non-IT-related boundaries such as culture, language and the specific interests of certain stakeholders (chapter 3).
- Linked data is suited to implement the Optional Filing method. Implementation may be supported by the GOWA (chapter 5).
- Linked data applications may provide for lower costs in IT development, IT maintenance and data communication (par. 3.6, 4.7).

Linked data is an exciting development which is very likely to expand and may change the world. It offers capabilities to really open up supply chains and even to realize data pipelines implicitly. But more than that: it may solve the problems in communication between EU member states, and eventually even between the EU and other continents, bridging semantic gaps.


7.1. Societal and scientific relevance

The scientific relevance of this paper is that it gives an overview of a wide problem area and pinpoints a specific problem solution. It evaluates some main IT findings of EU project research and puts them in a specific perspective. The results may create new attention for using linked data and a clearer awareness that it may be able to solve problems in interoperability. This paper gives direction for further research on the concept of Trusted Trade Lane. The research argues that authorities, and more specifically customs authorities, could be the most designated party to actuate RDF developments and knowledge building towards open world approaches.

This relevance closely relates to the societal relevance. Improved interoperability and visibility in supply chains will allow for less delay and easier growth in international trade, and will help in getting products to consumers faster and cheaper. The societal problem to solve is to increase the amount and quality of data that customs can use to perform their risk assessments. This in fact means opening up the black boxes of the supply chains and being able to discover TTLs. The societal relevance is that the parties involved in global supply chains will eventually realize that the more compliant they are, even furnishing information via linked data, the better they will be facilitated by border inspection agencies (responsive regulation).

Furthermore, this research may help in the actual decision making of the European Commission addressing the data quality of the ENS. It could consider changing its way of thinking from a traditional closed world manner to a more open and qualitative approach.

7.2. Limitations

This research tries to cover a wide domain consisting of customs procedures, customs legislation, prior and ongoing research on global supply chains, IT and the semantic web. Given the available time, the research was limited to the main activities needed to get an overview. For example, the semantic web developments and reasoning based on formal logic, evolving from artificial intelligence research, could have formed a topic by themselves. The designed architectures and ontologies are meant to give an impression of the possibilities of web technologies. They have not been validated or checked against other designs or practices. For this reason the evaluation has been based on gathering opinions and views on the research and its subject.

We should note that we took an openly positive attitude towards the possibilities of linked data. We intentionally disregarded possible drawbacks such as the unwillingness of stakeholders to use it, issues of security and authorisation of commercial data, and the data quality of open linked data.


7.3. Further research

The subject of this research is also part of the CORE-project, which is still running at the time of writing. As we have concluded, we find it questionable whether, and to what extent, semantic web technologies will be applied within CORE.

The primitive design models of this research should be further explored and detailed. Important general issues that we have not yet addressed lie in the areas of jurisprudence, IT performance and the data quality of linked data.

We only superficially discussed authentication and authorisation issues such as the IAA mentioned above. Customs should define their requirements for this and, in our opinion, must learn to deal with uncertainty about data quality and levels of trust. This is still an open area to explore.

We noted that Dutch Customs are unwilling to store information which was not received through (legally binding) declarations. In our opinion this would prevent them from building knowledge on, for example, trusted trade lanes. The latest development is that customs are considering this issue. An option might be to anonymize these data and to capture their essence in separate data patterns. Knowledge management within customs authorities should explore these legal implications further.

This paper does not cover research on technical tooling, although performance and search methods such as data crawling are very important when starting to make use of linked data. Furthermore, we only touched on elements of data quality. Although our starting point was the low data quality of the ENS as stated by the European Commission, we have focussed on the relevance of data and zoomed in on the invisibility of stakeholders. The quality of linked data should be analysed and weighed, and data should be filtered for use in the risk management process.

A very important change will be the shift from closed to more open world systems. Further research on the implications of changing modelling, reasoning and database techniques and technologies is evidently necessary. This may be guided by methods for a closure of the open world, and may further explore the proposed GOWA principle.

The concept of TTL is also still under open research. We believe that this concept will be able to provide the basic structure for risk management that customs is looking for and that data analytics will be necessary to build it.

But perhaps the most important aspect in adopting linked data technologies is the motivation and readiness of supply chain stakeholders to take such initiatives. We concluded that linked data may fulfil the data pipeline promises, but these will only be reached if the main parties see the potential of linked data and the feasibility of implementations, having overcome hesitations regarding privacy and the sharing of data. Further research into this, and at the same time into the legal aspects concerning the usage of linked business data by border inspection agencies, is necessary. Some other issues for further research that we have touched upon:

- To what extent may linked data be able to act as the envisaged Data Pipeline?
- How to design ontologies based on, or reusing, the WCO data model?
- How to combine a bottom-up approach of ontologies with a core customs domain ontology?
- How to apply unique IDs (and URLs) to supply chain 'things'?


7.4. Reflection

In the wide area of our master course, and with the many ideas I had for my master thesis, I experienced difficulties in finding a concrete subject. Even after determining that it should involve the customs problems with the ENS as well as linked data, it was hard to find an explicit point of attention. And maybe still, after finishing this thesis, I have the feeling it should be more detailed. But fortunately I could join the CORE project and combine Optional filing, linked data and the Trusted Trade Lane. It was instructive to be part of this project, especially being involved in the case study of a specific trade lane, although the project has not yet arrived at a definite stage with results.

But I also think now that I made a contribution. Many people I interviewed were little concerned with linked data or had never heard of it. In fact I sometimes found it hard to explain what my topic was about. The most impressive comment in favour of the semantic web came from my interview with Laura Daniele (TNO), who told me: 'when problems with traditional closed IT-systems grow too big, then the semantic web will automatically be accepted as the only solution'. The statement of this research is clearly that the semantic web is here to stay. I hope to have created some awareness of this. We could start with working on the shift towards open systems. The only question is when it will really be used to its capacities.

7.5. Acknowledgments

Firstly I wish to thank my supervisors for their advice and the opportunity to join the CORE-project. My thanks go to the colleagues of the various partners in this project, mostly to Dutch Customs but also Maersk, TNO and FloraHolland. Their comments and recommendations made it possible to evaluate my research. But I especially want to thank Marcel van Mackelenbergh, for sharing his enthusiasm for linked data and for providing me access to the triple-store of DTCA, and Thomas Jensen, who supported me morally in difficult times by assuring me that I was on the right track.


8. Annexes

8.1 Annex 1 Ontology example

Explanation is given in the comment lines (marked #):

# baseURI: http://DutchCustoms        (the URL reference to the example ontology)

# Prefixes for abbreviation, referring to re-used ontologies:
@prefix : <http://DutchCustoms#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# The meta-data:
<http://DutchCustoms>
  rdf:type owl:Ontology ;
  owl:versionInfo "Created with TopBraid Composer"^^xsd:string ;
.
# The subject class:
:Stakeholder_via_Customs
  rdf:type owl:Class ;
  rdfs:subClassOf owl:Thing ;
.
# The object class:
:Reported_ID
  rdf:type owl:Class ;
  rdfs:subClassOf owl:Thing ;
.
# The relating property from subject to object:
:has_reported_id
  rdf:type owl:DatatypeProperty ;
  rdfs:domain :Stakeholder_via_Customs ;
  rdfs:range :Reported_ID ;
  rdfs:range xsd:integer ;
.
# The actual data, first the IDs:
<http://DutchCustoms#111>
  rdf:type :Reported_ID ;
.
<http://DutchCustoms#222>
  rdf:type :Reported_ID ;
.
<http://DutchCustoms#333>
  rdf:type :Reported_ID ;
.
# The stakeholders and their properties:
:FloraHolland
  rdf:type :Stakeholder_via_Customs ;
  :has_reported_id <http://DutchCustoms#111> ;
.
:FF_A
  rdf:type :Stakeholder_via_Customs ;
  :has_reported_id <http://DutchCustoms#222> ;
.
:Grower_A
  rdf:type :Stakeholder_via_Customs ;
  :has_reported_id <http://DutchCustoms#333> ;
.
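To show how such linked data would be queried, a minimal SPARQL sketch over the example above (illustrative only):

PREFIX : <http://DutchCustoms#>

# List every stakeholder known via customs together with the ID it reported.
SELECT ?stakeholder ?id
WHERE {
  ?stakeholder a :Stakeholder_via_Customs ;
               :has_reported_id ?id .
}

Against the data above this returns :FloraHolland, :FF_A and :Grower_A, each with its reported ID resource.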


8.2 Annex 2 Ontology optional filing

# baseURI: http://trustedtrader

@prefix : <http://trustedtrader#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<http://trustedtrader>
  rdf:type owl:Ontology ;
  owl:versionInfo "Created with TopBraid Composer"^^xsd:string ;
.
:Bill_of_lading
  rdf:type owl:Class ;
  rdfs:subClassOf owl:Thing ;
.
:Container
  rdf:type owl:Class ;
  rdfs:subClassOf owl:Thing ;
.
:Shipment_MRN
  rdf:type owl:Class ;
  rdfs:subClassOf owl:Thing ;
  rdfs:comment "specific meaning added by re-using a vocabulary" ;
  :meaning "extra meaning is also definable in a specific format, like the next line" ;
  :meaning "xxx"@ja ;   # e.g. a language-tagged literal in Japanese
.
:Stakeholder
  rdf:type owl:Class ;
  rdfs:subClassOf owl:Thing ;
.
:Trade_lane
  rdf:type owl:Class ;
  rdfs:subClassOf owl:Thing ;
.
:Trusted_trader
  rdf:type owl:Class ;
  rdfs:subClassOf owl:Thing ;
.
:embodies
  rdf:type owl:FunctionalProperty ;
  rdfs:domain :Stakeholder ;
  rdfs:range :Trusted_trader ;
.
:explanation_comment
  rdf:type owl:FunctionalProperty ;
  rdfs:range xsd:string ;
.
:has_role
  rdf:type owl:FunctionalProperty ;
  rdfs:domain :Stakeholder ;
  rdfs:range [
    rdf:type rdfs:Datatype ;
    owl:oneOf (
      "carrier"^^xsd:string
      "customs agent"^^xsd:string
      "carrier agent"^^xsd:string
      "road transporter"^^xsd:string
      "freight forwarder"^^xsd:string
      "consolidator"^^xsd:string
      "seller"^^xsd:string
      "buyer"^^xsd:string
      "consignor"^^xsd:string
      "consignee"^^xsd:string
    ) ;
  ] ;
.
:is_optional_filer
  rdf:type owl:FunctionalProperty ;
  rdfs:domain :Stakeholder ;
  rdfs:range xsd:boolean ;
.
:is_trusted
  rdf:type owl:FunctionalProperty ;
  rdfs:domain :Trade_lane ;
  rdfs:range xsd:boolean ;
.
:registration_source
  rdf:type owl:FunctionalProperty ;
  rdfs:domain :Shipment_MRN ;
  rdfs:range :Stakeholder ;
  rdfs:range [
    rdf:type rdfs:Datatype ;
    owl:oneOf (
      "by ENS"^^xsd:string
      "by optional filing"^^xsd:string
    ) ;
  ] ;
.
:relates_to
  rdf:type owl:FunctionalProperty ;
  rdfs:domain :Bill_of_lading ;
  rdfs:domain :Container ;
  rdfs:range :Shipment_MRN ;
.
:reveals
  rdf:type owl:FunctionalProperty ;
  rdfs:domain :Shipment_MRN ;
  rdfs:range :Trade_lane ;
.
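As an illustration of how this ontology connects the classes, the following SPARQL sketch (data assumed) walks from a container via its shipment MRN to the trade lane it reveals, leaving the trust status optional because it may simply not be known yet:

PREFIX tt: <http://trustedtrader#>

# From a container to the trade lane it reveals, following the properties above.
SELECT ?container ?lane ?trusted
WHERE {
  ?container a tt:Container ;
             tt:relates_to ?mrn .
  ?mrn tt:reveals ?lane .
  OPTIONAL { ?lane tt:is_trusted ?trusted . }
}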


8.3 Annex 3 Interviewed

Bram Klievink              Researcher and coordinator EU-projects (TNO)
Haiko van der Voort        Assistant Professor TU Delft, lecturer for the Master Customs & Supply Chain Compliance
Laura Daniele              Researcher Semantic Web (TNO)
Lieke Verhelst             Linked Data Consultant
Maarten Veltman            Intelligence officer (DCA)
Marcel van Mackelenbergh   Linked Data Consultant (DTO)
Roel van 't Veld           Policy officer EU Customs risk management (DCA)
Wim de Viet                Maritime risk management (DCA)
Wim Visscher               Master lecturer (Customs advisor; DCA)

As CORE project partners (implicitly):

Eric Geerts                Customs Software Consultant (Descartes)
Frank Heijmann             Senior Customs Consultant and Advisor (DCA)
Harrie Bastiaansen         Researcher (TNO)
Roel Huiden                Coordinator (FloraHolland)
Thomas Jensen              Shipping expert (Maersk)


8.4 Annex 4 Abbreviations

D(T)CA      Dutch (Tax and) Customs Authorities
API         Application Programming Interface
BCS         Business Community System
B2B         Business to Business
B2G         Business to Government
CCP         Consignment Completion Point
TNO         Dutch general research organisation
DTO         Dutch Tax Office
IAA         Identification, Authorization and Access
IOS         Inter-organisational information system
IRI         Internationalized Resource Identifier
LDBC        Linked Data Benchmark Council
LOD         Linked Open Data
OGD         Open Government Data
PCS         Port Community System
RDFS        RDF Schema
RDF         Resource Description Framework
SWRL        Semantic Web Rule Language
SKOS        Simple Knowledge Organization System
SPIN        SPARQL Inferencing Notation
SPARQL      SPARQL Protocol and RDF Query Language
SQL         Structured Query Language (for relational databases)
SSTL        Safe and Secure Trade Lane
TTL         Trusted Trade Lane
URL         Uniform Resource Locator
URI         Uniform Resource Identifier
URN         Uniform Resource Name
UCC         Union Customs Code
UDDI        Universal Description, Discovery and Integration
UN/CEFACT   United Nations Centre for Trade Facilitation and Electronic Business
UNTDED      United Nations Trade Data Elements Directory
OWL         Web Ontology Language
W3C         World Wide Web Consortium


8.5 Annex 5 Data elements

Data elements, with an X marking inclusion in each of the following data sets: IPCC; DA UCC: ENS complete set; DA UCC: ENS partial set by the carrier; DA UCC: ENS partial set by a person; SAFE Framework; USA 10+2.

Goods items number  X X X
Unique Consignment Reference Number  X X X X X
Transport Document Number  X
Consignor (Master level transport contract)  X X X
Consignor (House level transport contract)  X X
Person Lodging the Summary Declarations  X X
Consignee (Master level transport contract)  X X X X
Consignee (House level transport contract)  X X
Declarant  X
Declarant with identification number  X X
Carrier  X X X X X
Notify Party (Master level transport contract)  X X X
Notify Party (House level transport contract)  X X
Identity of Active Means of Transport crossing the border  X X X X X
Nationality of Active Means of Transport crossing the border  X X X
Conveyance Reference Number  X X X X X
First Place/Port of Arrival Code  X X X X X
Date and time of arrival at first place/port of arrival in Customs territory  X X X X
Country(ies) of routing codes  X X X X
Mode of transport at the border  X X X
Customs office of exit  X
Location of goods:
  Place of loading  X X X X
  Place of unloading code  X X X
Goods description Master Level  X X X X
Goods description House Level  X X
Type of packages (code)  X X X
Number of packages  X X X
Shipping marks  X X X
Equipment identification number, if containerised  X X X X X
Commodity code (4 digits)  X
Gross mass (kg) Master level  X X X
Gross mass (kg) House level  X X
UN Dangerous Goods code  X X X
Seal number  X X X X
Transport charges method of payment code  X X X X
Declaration date  X X
Signature/Authentication  X X
Other specific circumstance indicator  X X
Subsequent Customs Office(s) of entry code  X X X
Manufacturer name and address  X
Seller name and address  X X X
Container stuffing location  X
Consolidator name and address  X
Buyer name and address  X X X
Ship to party name and address  X
Importer of record number  X
Country of origin of the goods  X
Six digit classification (HS) code  X X X
Vessel stow plan  X
Daily messages changes containers status  X

source: (Korol, Nijenhuis et al. 2015)


8.6 Annex 6 Questionnaire

Optional answers:
A  Agree
B  Could agree
C  No opinion
D  Disagree
E  Strongly disagree
X  Don't know / Cannot answer

Theses:

-1- Customs risk management should use data available on the web instead of only declared data; therefore adjust to the open world assumption instead of the closed world assumption (par. 1.3 and 5.2).

-2- Ontologies (par. 4.4) and vocabularies could help very much to improve interoperability in the supply chains by creating agreed standards and semantics.

-3- Ontologies may help to bridge cultural and language gaps in global supply chains.

-4- The 'data pipeline' benefits for customs, such as piggy-backing, could be achieved by using linked data (par. 5.3).

-5- A principle as ‘guided open world assumption’ may be a good start for working with linked data (par. 5.2).

-6- ‘Optional filing’ fits this principle well and could be implemented using linked data.

-7- Customs could take advantage of semantics and reasoning, for example in determining trusted traders and trusted trade lanes (par. 5.4).

-8- Customs are the right party to take initiatives in building knowledge to use linked data technologies.

-9- Two ontologies were sketched in par. 5.3, capable of registering optional filings. Figure 20 (situation 1) focusses on stakeholder knowledge only; (situation 2) is more detailed on contract data and thereby on the mutual stakeholder relations established by contracts. It is preferable to work towards situation 2 directly. Please comment on the models if you wish! Note: the second ontology (situation 2) has been removed from the final version of this paper for better overview.

-10- The CORE-project is now about to deliver demonstrations on using semantic web / linked data technologies.

-11- The Living Labs are very well suited for further research on using semantic web technologies.


8.7 Annex 7 Web links

allegro; 2015      http://franz.com/agraph/allegrograph/
artint; 2015       http://artint.info/html/ArtInt_129.html
bergman; 2015      http://www.mkbergman.com/852/the-open-world-assumption-elephant-in-the-room/
cambr; 2015        http://www.cambridgesemantics.com/semantic-university/getting-started-semantics
core; 2015         http://www.coreproject.eu/
d&b; 2015          http://www.dnb.com
dbp; 2015          http://en.wikipedia.org/wiki/DBpedia
dcore; 2015        http://dublincore.org/
descartes; 2015    https://www.descartes.com
dutchvision; 2015  https://www-950.ibm.com/events/wwe/grp/grp009.nsf/vLookupPDFs/D1S6.1C_FrankHeijmann_Enforcement%20Vision%20EN/$file/D1S6.1C_FrankHeijmann_Enforcement%20Vision%20EN.pdf
euon; 2015         http://www.eudat.eu
fp7; 2015          http://ec.europa.eu/research/fp7
Horrocks; 2015     http://lists.w3.org/Archives/Public/www-rdf-logic/2001Jul/0067.html
isis; 2015         http://www.nbcnews.com/storyline/isis-terror
ld.org; 2015       http://linkeddata.org/
ldbc; 2015         http://www.ldbc.eu/
lod2; 2015         http://cordis.europa.eu/project/rcn/95562_en.html and: www.lod2.eu
manch; 2015        http://www.cs.man.ac.uk/~drummond/presentations/OWA.pdf
neo4j; 2015        http://neo4j.com/
rdamrules; 2015    http://www.rotterdamrules.com/convention
sem; 2015          http://www.semanticweb.org
spin; 2015         http://spinrdf.org/
swrl; 2015         http://www.w3.org/Submission/SWRL/#1
tbl; 2007          http://dig.csail.mit.edu/breadcrumbs/node/215
tbl; 2015          http://www.w3.org/DesignIssues/
tbraid; 2015       http://www.topquadrant.com/tools/modeling-topbraid-composer-standard-edition/
ucc; 2015          http://eur-lex.europa.eu/legal-content/EN/TXT/PDF/?uri=CELEX:32013R0952&rid=1
un-code; 2015      http://wiki.goodrelations-vocabulary.org/Documentation/UN/CEFACT_Common_Codes
uniprot; 2015      http://www.uniprot.org/
verhelst; 2015     www.linkeddatafactory.nl
w3c; 2015          http://www.w3.org/2001/sw/wiki/Main_Page
w3c-rdf; 2015      http://www.w3.org/TR/2014/NOTE-rdf11-primer-20140225/
wco; 2015          http://www.wcoomd.org/en/topics/facilitation/instrument-and-tools/tools/pf_tools_datamodel.aspx
wiki-kg; 2015      http://en.wikipedia.org/wiki/Knowledge_Graph
wiki-ld; 2015      https://en.wikipedia.org/wiki/Linked_open_data
wiki-logic; 2015   https://en.wikipedia.org/wiki/First-order_logic
wiki-pubsub; 2015  http://en.wikipedia.org/wiki/Publish%E2%80%93subscribe_pattern
wiki-push; 2015    http://en.wikipedia.org/wiki/Push_technology
wiki-sem; 2015     https://en.wikipedia.org/wiki/Formal_semantics_(logic)
wiki-tr; 2015      http://en.wikipedia.org/wiki/Triplestore


9. References

Allemang, D. and J. Hendler (2011). Semantic web for the working ontologist: effective modeling in RDFS and OWL, Elsevier.

Antoniou, G. and F. Van Harmelen (2004). A semantic web primer, MIT press.

Association, I. S. (1990). "Standard glossary of software engineering terminology." IEEE Std 610.12-1990.

Bauer, F. and M. Kaltenböck (2011). "Linked open data: The essentials." Edition mono/monochrom, Vienna.

Commission, E. (2012). "Communication (2012-793) on Customs Risk Management and Security of the Supply Chain." EU Communication(2012-793).

Commission, E. (2014). "Title I General Provisions." Delegated and Implementing Acts.

Commission, E. (2014-2). "Title IV Goods brought into the customs territory of the union." Delegated and Implementing Acts.

Commission, E. (2014-3). "Communication (2014/527) on the EU Strategy and Action Plan for customs risk management: Tackling risks, strengthening supply chain security and facilitating trade."

CORE, p. (2013). "Annex I - Description of Work." 126.

Daniele, L. and L. Ferreira Pires (2013). "An ontological approach to logistics."

DuCharme, B. (2013). Learning Sparql, " O'Reilly Media, Inc.".

Hanseth, O. and K. Lyytinen (2010). "Design theory for dynamic complexity in information infrastructures: the case of building internet." Journal of Information Technology 25(1): 1-19.

Heath, T. and C. Bizer (2011). "Linked data: Evolving the web into a global data space." Synthesis lectures on the semantic web: theory and technology 1(1): 1-136.

Hesketh, D. (2009). "Seamless electronic data and logistics pipelines shift focus from import declarations to start of commercial transaction." World Customs Journal 3(1): 27-32.

Hesketh, D. (2010). "Weaknesses in the supply chain: who packed the box." World Customs Journal 4(2): 3-20.

Hesketh, D. (2015). "Seamless integrated data pipelines." Background to the data pipeline concept and progress on the CORE Project, Work Package 10.

Hevner, A. R. (2007). "A three cycle view of design science research." Scandinavian journal of information systems 19(2): 4.

Hofman, W. (2011-2). Supply chain visibility with linked open data for supply chain risk analysis. Workshop on IT Innovations Enabling Seamless and Secure Supply Chains.

Hofman, W. (2015). Towards a Federated Infrastructure for the Global Data Pipeline. Open and Big Data Management and Innovation, Springer: 479-490.

Hofman, W. and H. Aldewereld (2011). "D3-6 Cassandra; IT Synthesis and integration." (D3.6).

Hofman, W. and H. Bastiaansen (2013). D3-1 Cassandra; Integration architecture - overview (restricted dissemination). FP7-SEC-2010-3.2-1.


Hohpe, G. and B. Woolf (2004). Enterprise integration patterns: Designing, building, and deploying messaging solutions, Addison-Wesley Professional.

Jensen, T., et al. (2014). Avocados crossing borders: the missing common information infrastructure for international trade. Proceedings of the 5th ACM international conference on Collaboration across boundaries: culture, distance & technology, ACM.

Jensen, T. and R. Vatrapu (2015). "Ships & Roses: a revelatory case study of affordances in international trade." (Twenty-third European Conference on Information Systems (ECIS), Munster, Germany, 2015): 1-18.

Karakostas, B. (2014). "A Knowledge Graph for supply chain ecosystems." CORE workpackage 8.

Karakostas, B. (2015). "CORE Ecosystem Architecture."


Klievink, A., et al. (2012). "Enhancing Visibility in International Supply Chains: The Data Pipeline Concept." International Journal of Electronic Government Research, 8 (4), 2012.

Klievink, B. and I. Lucassen (2013). Facilitating adoption of international information infrastructures: a Living Labs approach. Electronic Government, Springer: 250-261.

Korol, L., et al. (2015). "Quality of data and risk management on safety and security by EU Customs." (Master Customs & Supply Chain Compliance; Delft University of Technology).

Loukakos, P. and R. Setchi (2010). Application of ontological engineering in customs domain. Knowledge-Based and Intelligent Information and Engineering Systems, Springer: 481-490.

Mendes, P. N., et al. (2012). Sieve: linked data quality assessment and fusion. Proceedings of the 2012 Joint EDBT/ICDT Workshops, ACM.

Ozturk, N., et al. (2014). "How reliable are the ENS data?" (Master Customs & Supply Chain Compliance; Delft University of Technology).

Patel-Schneider, P. F. and I. Horrocks (2007). "A comparison of two modelling paradigms in the Semantic Web." Web Semantics: Science, Services and Agents on the World Wide Web 5(4): 240-250.

Pokraev, S., et al. (2005). "Semantic and pragmatic interoperability: a model for understanding."

Russell, S. and P. Norvig (1995). "Artificial intelligence: a modern approach."

Staab, S. and R. Studer (2010). Handbook on ontologies, Springer Science & Business Media.

Tan, Y.-H., et al. (2010). Accelerating Global Supply Chains with IT-innovation: ITAIDE tools and methods, Springer Science & Business Media.

Veenstra, A., et al. (2013). Vermindering van regeldruk in de logistieke sector bij export en import, TNO.

Verschuren, P., et al. (2010). Designing a research project, Eleven International Publishing House.
