    Metadata Categories for Supporting Concurrent Engineering

    Juliane Blechinger, Frank Lauterwald, Richard Lenz

    Chair for Computer Science 6 (Data Management)

    University of Erlangen and Nuremberg

    Erlangen, Germany

    {juliane.blechinger, frank.lauterwald, richard.lenz}@informatik.uni-erlangen.de

    Abstract: Concurrent engineering is a keyword in today's enterprises. Almost every enterprise parallelizes its engineering processes to reach a higher efficiency in designing its products. Unfortunately, the time- and cost-saving potential of concurrent engineering cannot be used to its full capacity. In fact, design problems arise and lead to a lot of rework. As we have recognized, design problems always affect the underlying data. Thus, wrong data is an indication of such problems. As a consequence, improving the data quality should reduce the design problems. Although the data quality-related research community has proposed various management approaches, these approaches are too generic and thus give little guidance about what to do in a given situation. The goal of our project, DQ-Step, is to develop a solution that is general enough to cover an entire domain, namely concurrent engineering, while remaining concrete enough to give usable guidelines to enterprises to support their engineers and finally speed up their design. In our previous work, we have already identified the major problems in concurrent engineering. In this paper, we discuss the metadata categories required to help overcome these problems.

    Index Terms: concurrent engineering; metadata; meta model; data quality; adaptability; concrete approach for enterprises

    I. INTRODUCTION

    In the last decade, many enterprises switched from the common sequential engineering of their products to the concurrent way of engineering. They parallelized their design steps with the intention to speed up the whole engineering process. However, they forgot to build up an appropriate information infrastructure that is necessary to realize the intended time and cost savings. Without this information, the parallelized, decomposed, and iterating nature of concurrent engineering leads to design problems and inconsistencies that are detected very late and thus lead to huge rework cycles.

    To deal with these inconsistencies and other data quality problems, management approaches and meta models have been proposed in the data quality-related scientific literature. However, these approaches and models are very abstract and cannot serve as specific solutions in real projects. In our project, we strive for a solution that is as general as possible, while remaining able to give concrete advice to enterprises that practice concurrent engineering. We propose a metadata-based approach that enables domain experts to easily establish the information infrastructure that the engineers need to monitor and support their concurrent engineering processes. It is very important to keep in mind that only engineers themselves have the necessary knowledge to solve their particular problems. Thus, the built information infrastructure for supporting concurrent engineering has to be highly usable according to the principles of interaction design: "The way an interface is designed can greatly affect how well people can perceive, attend, learn, and remember how to carry out their tasks." [1, p. 104]. Ignoring those principles will lead to bad interfaces which the engineers tend to refuse. Of course, if the engineers do not use an information system, the supporting effect of the system cannot be achieved.

    One of our main goals is generalizability. Thus, our solution is not tied to certain existing information management systems (IMSs), which usually differ greatly between projects and enterprises. In addition, our metadata categories are general enough to be easily adaptable to various projects in concurrent engineering domains. The contribution of this article is the detailed description of these metadata categories that provide both engineers and managers with the information they need to effectively and efficiently do their task at hand. Furthermore, we discuss implementation issues to offer a concrete solution that can easily be adopted or at least tested by enterprises. The contribution of this article builds on our work in [2] where we have already provided a comprehensive problem classification. We summarize the main points of this problem classification in this article and present the metadata categories and implementation issues we have recently derived from it.

    The remainder of this article is organized as follows: Section 2 summarizes the related work. It first provides the scientific definition and special challenges of concurrent engineering. Second, it summarizes the existing management approaches and meta models related to data quality issues. Section 3 presents our application domain, plant engineering, and summarizes the main points of our problem classification that we have already presented in [2]. Finally, the general requirements for an approach aiming at supporting concurrent engineering environments are deduced. In Section 4, we describe the necessary metadata categories for supporting concurrent engineering. The necessity of the selected metadata categories is based on our problem classification and the findings of our comprehensive analysis regarding the current process, the IMS landscape, and the real data in the enterprise of our project partner. Of course, during the implementation process, the selection and realization of the categories were reviewed several times with engineers and managers. After a discussion of our approach in Section 5, we conclude and present some future work in Section 6.

    2011 15th IEEE International Enterprise Distributed Object Computing Conference Workshops

    978-0-7695-4426-7/11 $26.00 2011 IEEE

    DOI 10.1109/EDOCW.2011.10

    II. RELATED WORK

    To place our approach into a proper context, this section reviews related work. To the best of our knowledge, there is no directly comparable approach in the scientific literature. Thus, we looked into the topics that our approach includes: First, we present the basic concepts of concurrent engineering. Second, well-known management approaches for data quality are briefly summarized. The section ends with existing meta models for data quality in the scientific literature.

    A. Concurrent Engineering

    In today's enterprises, the sequential way of designing products is no longer state of the art. A close-meshed, interleaved engineering concept is implemented instead in order to achieve cost and time savings. In the mid-90s, this observation caught researchers' attention and concurrent engineering was comprehensively defined. One definition is given by [3] and [4]: Concurrent engineering implies that there are several process steps that can proceed in parallel. However, many of them are interdependent, requiring designers to proceed with partial information, with incomplete knowledge, and subjective interpolations. To make parallel design of interdependent process steps possible, early preliminary upstream information has to be shared with downstream design stages. Each iteration in the design results in changes that must propagate through all dependent design stages. Consequently, (iterating) rework, upstream and downstream, can be needed to reach a consistent design [5].

    Consequently, concurrent engineering has to deal with four basic aspects: parallelism, decomposition, iteration, and stability. That is, the whole design task is decomposed into several design steps (or stages). These design steps are executed in parallel. The sharing of preliminary information leads to iterations when something is changed. To reach convergence, the iteration has to be stable. This means that the number of newly arising design problems has to be less than the number of the solved ones.

    We have experienced that, although concurrent engineering is common in today's enterprises, the engineers are mostly not supported well enough regarding the four basic aspects mentioned above. This not only leads to data quality problems but also substantially reduces the potential of concurrent engineering regarding time and cost savings. Additionally, to the best of our knowledge, only few of the existing approaches in the scientific literature focus on or consider concurrent engineering as the root cause of many engineering problems on the surface. This motivated us to fill this gap and particularly focus on concurrent engineering in the remainder of this paper.

    B. Management Approaches for Data Quality

    In the domain of data quality, many approaches for managing data quality have been presented. However, these approaches are very theoretical and offer no direct practical implementation instructions. Many researchers in the data quality domain focus on the definition and classification of data quality dimensions [6], [7]. These articles are the basis for management approaches like TDQM (Total Data Quality Management) [8], TIQM (Total Information Quality Management) [9], AIMQ (AIM Quality) [10] or the modeling approach IP-MAP (Information Product MAP) [11]. Because data quality is multidimensional and thus all arising problems are very application-specific, these management approaches are too abstract for concrete application scenarios. Enterprises need advice with practical implementation instructions that they can easily apply to their domain.

    C. Meta Models

    Finally, we summarize existing meta models that are at least somehow related to data quality issues. Because most of the meta models mentioned in this subsection have a completely different focus than our approach, it is not surprising that these models do not consider concurrent engineering scenarios. Thus, the following statements are in line with our requirements, but do not necessarily fault those meta models.

    Metadata-based data quality solutions are presented in [12] and [13]. The approach in [12] is very conceptual and does not offer concrete metadata classes or rule types. It also does not concentrate on concrete metrics to measure data quality problems, which is instead a main topic in [13]. The meta model in [13] focuses on SQL-based rules and offers a concrete metadata schema. However, both models disregard the process and engineering context.

    Besides the data quality-specific meta models, there are also meta models in the context of business processes. The business process activity meta model in [14] classifies business rules into global, structural, and activity rules. The resulting meta model is separated into entities related to the business process and its behaviour and into entities that focus on necessary characteristics for the IMS implementation. Another meta model for business rules is presented in [15]. This meta model consists of the following entity types: data model component, business rule, process, origin, information system component, actor, organizational unit, and software component. Consequently, this model is much more detailed and comprehensive than the model in [14]. However, both models also disregard the characteristics of concurrent engineering. Besides, the process context is not adequately represented.

    Finally, the enterprise architecture meta model in [16] aims at offering a situation-based solution for IT/business alignment. It consists of entities for business processes, organizational units, and information systems. Regarding adaptability, information system neutrality, and the goal of offering an applicable solution, it is similar to our approach. However, from our point of view, it is not detailed enough and has a different focus than our meta model: it does not consider concurrent engineering scenarios and offers no rule component for monitoring data quality.

    Fig. 1. Entity-relationship diagram of the implemented metadata categories (attributes omitted)

    III. PROBLEM DEFINITION

    In this section, we first characterize our application domain, namely plant engineering, that represents the real environment for our implementations. All of the findings we present in this article were gained via comprehensive analyses and interviews in the enterprise of our plant engineering project partner. The second subsection summarizes the problem classification we have already presented comprehensively in [2]. The deduced general requirements for an approach aiming at supporting concurrent engineering environments are finally presented in the last subsection.

    A. Application Domain

    The application domain of our project partner that serves as the real environment for user interviews, data and process analysis, and the implementation of our approach is plant engineering. Plant engineering is characterized by the engineering of huge physical objects that are only built on explicit order and consist of many composite objects (or items). According to their functionality and construction, these objects can be classified into object types. Other domains with the same characteristics are, for example, ship building, astronautical engineering or the design and construction of any kind of industrial factories, e.g. for the chemical industry.

    Projects in plant engineering are huge and last several years. This makes it impossible to change the underlying IMS landscape during a running project, if problems regarding data quality, delays in time, or complaints from the engineers arise. Besides, even in engineering projects with shorter durations, the replacement of IMSs is often not up for discussion. This has two reasons: First, every major enterprise has an IMS landscape consisting of many interdependent subsystems with various interfaces. This makes it hard to change the whole landscape. Changing only one subsystem would not solve the problems mentioned above, because data quality problems, user dissatisfaction as well as time delays result from the whole landscape and not from only one subsystem. Second, changing the IMS landscape is a management decision and can hardly be influenced by the engineering department.

    These considerations imply that an approach for supporting concurrent engineering has to be IMS-neutral and thus needs to be placed on top of the IMS landscape. This makes it easy for various enterprises to adopt, or at least test, the approach at low cost.

    B. Problem Classification

    To base our approach on a comprehensive problem classification, we have applied various analysis techniques to our application domain. First, we conducted interviews with representatives of different user groups. Engineers, administrators, and managers were involved. Second, we conducted an electronic assessment survey to get a broad view. Third, we analyzed the current process, the IMS landscape, and the real data in the enterprise of our project partner. Based on the results, we identified the following three basic problem classes: necessary information unavailable to the engineers, problems of wrong data values, and necessary management information. In the following, we briefly summarize these problem classes, which are comprehensively described and exemplified in [2].

    1) Information Deficits for Engineers: This problem class contains information that is not at all, or only with a lot of effort regarding searching and comparing activities, available to the engineers. However, this information is very important for supporting engineers in concurrent engineering environments; for example, the availability of the following information types would keep the engineers from wrong assumptions that lead to wrong data and finally to a lot of rework:

    Responsibility: The engineer receives attributes from previous design steps. If s/he has problems or questions regarding these attributes, s/he needs to know the name and contact details of the persons responsible for filling them to be able to immediately contact them without losing time.

    Object Type Classification: If an engineer has to design an instance of a special object type, s/he needs to know exactly what attributes have to be filled when and by which role. This information helps the engineer to be aware of what s/he is required to fill.

    Interaction Messages: The engineers need usable interaction possibilities between design steps and responsibilities. For example, they have to be notified if attributes of objects they are involved with change. Additionally, they need a mechanism to easily send change requests to responsible persons, if attributes from previous design steps raise conflicts with their design.

    Obligation Levels: For the attributes the engineers receive, they need to know their obligation level. This would keep the engineers from building their design on unreliable information.

    2) Wrong Data Values: This problem class contains typical data error types. Taxonomies of data error types are already comprehensively provided in the scientific literature [17], [18]. Additionally, definitions regarding the underlying data quality dimensions are given in [19]. Thus, only the most common data error types are briefly summarized:

    Incompleteness: Objects are specified insufficiently, because mandatory attributes are not filled.

    Incorrectness: Attribute values can be incorrect in various ways. For example, they can have the wrong format or be wrong regarding the allowed range or list of values. Of course, one can say that these problems can be avoided with constraints, but in most application domains every rule has exceptions. That is why rigid constraints are hardly implemented in IMSs. Thus, the engineers need a possibility to define rules to be able to find possible mistakes. Whether these possible mistakes are really erroneous and need to be corrected, or whether they are just exceptions, can only be judged by the responsible engineer. Besides, there are also types of incorrectness that can only be found via manual inspection or by means of validated reference data.

    Inconsistency: In concurrent engineering environments, many types of inconsistencies can arise. Examples are: different attribute values of replicated objects in different IMSs, inconsistencies between adjacent objects, inconsistencies between objects and predefined catalogues, and inconsistencies between attribute values of one object with conditional functional dependencies [20].

    3) Information Deficits for Managers: This problem class contains information about the overall project status. This information is primarily needed by managers to estimate design durations and reassign resources according to newly incurred bottlenecks. Thus, the managers need aggregated information about the number of completed and uncompleted objects per design step and object type with the name and contact details of responsible persons.
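
    To illustrate the aggregated management view described above, the following is a minimal sketch that counts completed and uncompleted objects per design step and object type. The record layout, the field names, and the idea that completion is available as a simple Boolean flag are assumptions made for this example only; the source does not specify them.

```python
from typing import Dict, Iterable, Tuple

StatusKey = Tuple[str, str]  # (design step, object type)


def project_status(objects: Iterable[dict]) -> Dict[StatusKey, Dict[str, int]]:
    """Count completed and uncompleted objects per design step and object type."""
    status: Dict[StatusKey, Dict[str, int]] = {}
    for obj in objects:
        key = (obj["design_step"], obj["object_type"])
        counts = status.setdefault(key, {"completed": 0, "uncompleted": 0})
        counts["completed" if obj["completed"] else "uncompleted"] += 1
    return status


# Hypothetical sample data; identifiers and step names are illustrative only.
objects = [
    {"id": "V1", "object_type": "valve", "design_step": "piping", "completed": True},
    {"id": "V2", "object_type": "valve", "design_step": "piping", "completed": False},
    {"id": "P1", "object_type": "pump", "design_step": "piping", "completed": False},
    {"id": "V3", "object_type": "valve", "design_step": "instrumentation", "completed": True},
]

status = project_status(objects)
for (step, object_type), counts in sorted(status.items()):
    print(step, object_type, counts)
```

    Contact details of responsible persons could be joined onto each group from the responsibility metadata; that join is left out here to keep the sketch short.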

    C. Requirements

    To sum up, we identified the following requirements for supporting concurrent engineering:

    The solution should be IMS-neutral and adaptable. This is necessary, because almost every enterprise has an existing IMS landscape consisting of many interdependent subsystems and the exchange of this IMS landscape is often not up for discussion. Additionally, future IMS changes cannot be foreseen. Consequently, an IMS-neutral and adaptable solution would make it easy for various enterprises to adopt, or at least test, the solution at low cost.

    In engineering enterprises, not the technical administrators but the engineers have the comprehensive knowledge to adequately define domain-specific data quality rules. Additionally, the engineers will use the solution only if it is intuitively usable, understandable, and motivating. Thus, important requirements from the domain of interaction design arise.

    The solution has to offer special information for engineers, aggregated information for managers, and a rule definition possibility to find wrong data values. The information for the engineers consists of responsibility information, an object type classification, interaction messages, and obligation levels. The information for the managers shows the overall project status with contact details of responsible persons. Finally, the engineers need a possibility to easily define data quality rules to find possible mistakes regarding incomplete, inconsistent, or incorrect data values.

    Before we present our metadata categories in the next section, it has to be mentioned that state-of-the-art tools in plant engineering do not adequately address these requirements. Additionally, we analyzed both commercial and research data quality tools in [2], but those tools concentrate on cleaning and profiling activities and not on concurrent engineering. A comprehensive survey of data quality tools can be found in [21].

    IV. METADATA CATEGORIES

    In this section, we present the necessary metadata categories for supporting concurrent engineering. In fact, we identified six main categories. Every subsection describes one of these six categories. Because practical implementation instructions are mandatory for today's enterprises, we show by example how we have implemented the mentioned metadata categories in our real project. The overall entity-relationship diagram of the metadata categories implemented in our prototype is presented in Figure 1. For lack of space, attributes are omitted in Figure 1, but are mentioned in the following subsections where important. Besides, the entity types that we used to implement every category in our real project are also briefly addressed.

    Additionally, metadata categories can be separated into project-specific and user-specific categories. Project-specific metadata has to be defined by the administrators once at the beginning of the project and stays static afterwards. In contrast, user-specific metadata is defined by the engineers themselves and can change arbitrarily during the project.

    A. Object Type Model

    The final product in engineering consists of many composite objects. For example, in plant engineering, an object is a pipe, a valve, or a pump. To design an object, many different engineering departments have to calculate and finally fill attributes of this object according to their role. Which attributes have to be filled for an object is defined by its type, the object type. Object types can be classified by their functionality and their construction. In fact, commonly many object types exist that differ only in a few constructional issues. So, it is reasonable to classify the object types hierarchically according to a tree structure. That way, the redundancy of describing many similar object types is reduced significantly. New object types can easily be added to the tree structure. Additionally, the engineer can derive a special object type via its superclasses in the tree structure, which makes it easy to understand the functionality and construction issues behind the object type name.

    As an example, a small part of our concrete object type model for valves in plant engineering is given in Figure 2 where the classification path to the existing object type ControlValvePlugEMSR is depicted. Via this classification path, the engineer gets the information that objects of this type have the control function, two separate inputs and a plug moving device (see [22] for valve-specific explanations). Referring to Figure 1, Figure 2 is a zoom into the Object Type Attribute entity type in our concrete implementation; it demonstrates how we implemented the object type model as a tree structure (more details on this tree structure can also be found in [2]).

    Fig. 2. Part of the Object Type Model for valves in plant engineering (attributes omitted)

    We stored this tree structure of object types in the entity type Object Type Attribute. In detail, we first built generalizations of object types as exemplified in Figure 2. Then, the attribute names of the object types were reassigned to the adequate superclasses in the tree. Finally, the entity type Object Type Attribute has to map this tree by means of at least the attributes attribute name, type (as assigned superclass), source (the IMS where the attribute value is saved), and a Boolean mandatory?. This entity type is the central point of all other metadata categories and thus has to be filled carefully. It belongs to the category of project-specific metadata and has to be defined once at the beginning of the project.
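
    The tree-structured object type model can be sketched in a few lines. The following is an illustration under stated assumptions, not the prototype's implementation: only the leaf name ControlValvePlugEMSR and the four attribute columns (attribute name, type, source, mandatory?) come from the text, while the intermediate type names, the attribute names, and the IMS names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AttributeDef:
    """One row of a hypothetical Object Type Attribute table."""
    name: str
    source: str       # IMS where the attribute value is saved
    mandatory: bool   # corresponds to the Boolean "mandatory?"


@dataclass
class ObjectType:
    name: str
    parent: Optional["ObjectType"] = None
    attributes: List[AttributeDef] = field(default_factory=list)

    def lineage(self) -> List["ObjectType"]:
        """Return the types from the root of the tree down to this type."""
        chain: List["ObjectType"] = []
        node: Optional["ObjectType"] = self
        while node is not None:
            chain.append(node)
            node = node.parent
        return list(reversed(chain))

    def classification_path(self) -> List[str]:
        return [object_type.name for object_type in self.lineage()]

    def all_attributes(self) -> Dict[str, AttributeDef]:
        """Own attributes plus everything inherited from the superclasses."""
        merged: Dict[str, AttributeDef] = {}
        for object_type in self.lineage():
            for attribute in object_type.attributes:
                merged[attribute.name] = attribute
        return merged


# Hypothetical excerpt of a valve hierarchy; only the leaf name is from the text.
valve = ObjectType("Valve", attributes=[AttributeDef("nominal_diameter", "IMS_A", True)])
control_valve = ObjectType("ControlValve", valve, [AttributeDef("control_signal", "IMS_B", True)])
control_valve_plug = ObjectType(
    "ControlValvePlug", control_valve, [AttributeDef("plug_material", "IMS_A", False)]
)
leaf = ObjectType("ControlValvePlugEMSR", control_valve_plug)

path = leaf.classification_path()
mandatory = sorted(a.name for a in leaf.all_attributes().values() if a.mandatory)
print(" > ".join(path))
print(mandatory)
```

    Assigning each attribute to the highest superclass it applies to is what removes the redundancy: a new leaf type only declares what distinguishes it from its parent.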

    B. Design Step Model and Responsibility

    The attributes of an object are filled in several interdependent design steps and by various engineering roles. Consequently, if an already released attribute is changed in a previous design step, it has to be clarified which of the subsequent design steps are affected. In order to provide this information to the engineers, a design step model has to be built that consists of the definition of the design steps and their order. Additionally, the design step model extends the object type model with information on the attributes regarding when they have to be filled and by which role. Every attribute has an engineering role that has the responsibility to fill it (fill-responsible role). All other roles have just the permission to read the concerned attribute.

    To provide the responsibility information regarding the name and contact details of responsible persons to the engineers, metadata about the engineers themselves also has to be stored. Consequently, every person has an engineering role. Based on that, every person can mark him/herself in his/her role as the fill-responsible person or just as an interested person for specific objects. Of course, only one person per role can be the fill-responsible person of a specific object, whereas any number of persons can be just interested in the changes of a specific object. The personal adoption of roles for specific objects is necessary for notification purposes. This way it can be easily determined which persons have to be informed if an object in a previous design step has changed. Additionally, if an engineer in a subsequent design step encounters problems with a received attribute from a previous design step, the fill-responsible person can be easily identified. This speeds up the communication and feedback loops.

    Referring to Figure 1, we implemented the design step model with the entity type Design Step and an n:m-relationship type has next regarding the order. It is important to notice that every design step in the design step model has to have at least either a previous or a next design step. So, it can be guaranteed that no design step is isolated. The responsibility issues are implemented with the entity types Responsible Role, Person, and Person Instance Connection. The latter stores via the attributes instance id, person id, and a Boolean fill-responsible? the personal adoption of roles for specific objects (or instances).

    The entity types Design Step, Responsible Role and the n:m-relationship type has next belong to the category of project-specific metadata and have to be defined once at the beginning of the project. However, the entity types Person and Person Instance Connection belong to the category of user-specific metadata, are defined by the engineers themselves and can change during the project.
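
    The design step model and the responsibility metadata can be sketched as a small directed graph plus a registry of role adoptions. This is a minimal illustration, assuming in-memory structures rather than the database tables of the prototype; the step names, person names, role names, and all class and method names are hypothetical. It shows the three checks described above: which subsequent steps a change affects, that no design step is isolated, and that only one person per role can be fill-responsible for a given instance.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


class DesignStepModel:
    def __init__(self, steps: Iterable[str], has_next: Iterable[Tuple[str, str]]):
        self.steps: Set[str] = set(steps)
        self.successors: Dict[str, Set[str]] = {s: set() for s in self.steps}
        self.predecessors: Dict[str, Set[str]] = {s: set() for s in self.steps}
        for earlier, later in has_next:  # one pair per "has next" relationship
            self.successors[earlier].add(later)
            self.predecessors[later].add(earlier)

    def isolated_steps(self) -> Set[str]:
        """Steps with neither a previous nor a next step (should be empty)."""
        return {s for s in self.steps
                if not self.successors[s] and not self.predecessors[s]}

    def affected_steps(self, changed_step: str) -> Set[str]:
        """All steps reachable from changed_step via "has next"."""
        seen: Set[str] = set()
        queue = deque(self.successors[changed_step])
        while queue:
            step = queue.popleft()
            if step not in seen:
                seen.add(step)
                queue.extend(self.successors[step])
        return seen


@dataclass(frozen=True)
class PersonInstanceConnection:
    instance_id: str
    person_id: str
    fill_responsible: bool  # False means "interested person"


class Responsibilities:
    def __init__(self, role_of_person: Dict[str, str]):
        self.role_of_person = role_of_person
        self.connections: List[PersonInstanceConnection] = []

    def adopt(self, connection: PersonInstanceConnection) -> None:
        role = self.role_of_person[connection.person_id]
        if connection.fill_responsible:
            for existing in self.connections:
                if (existing.fill_responsible
                        and existing.instance_id == connection.instance_id
                        and self.role_of_person[existing.person_id] == role):
                    raise ValueError(f"{role} already has a fill-responsible person")
        self.connections.append(connection)

    def persons_to_notify(self, instance_id: str) -> List[str]:
        return sorted({c.person_id for c in self.connections
                       if c.instance_id == instance_id})


model = DesignStepModel(
    ["process_design", "piping", "instrumentation", "procurement"],
    [("process_design", "piping"), ("process_design", "instrumentation"),
     ("piping", "procurement")],
)
affected = model.affected_steps("process_design")

responsibilities = Responsibilities(
    {"alice": "piping_engineer", "bob": "piping_engineer", "carol": "instrument_engineer"}
)
responsibilities.adopt(PersonInstanceConnection("valve-17", "alice", True))
responsibilities.adopt(PersonInstanceConnection("valve-17", "carol", False))
try:
    responsibilities.adopt(PersonInstanceConnection("valve-17", "bob", True))
except ValueError as error:
    print("rejected:", error)
print(sorted(affected), responsibilities.persons_to_notify("valve-17"))
```

    Keeping the role on the person and only the Boolean on the connection mirrors the attributes named in the text (instance id, person id, fill-responsible?).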

    C. Rule Model

    In order to check if the filled attribute values are possibly wrong, i.e. incorrect, incomplete, or inconsistent, the engineers need a possibility to define rules. These rules shall express the engineers' expectations and experience regarding the characteristics of attribute values of specific object types. The rules are considered as tree structures whose leaves consist of checks on attributes. These checks can be either simple comparisons of attributes with values or more complex dependencies between several attributes. To give an example, a simple comparison of an attribute with a value is the expression that the maximum allowable working temperature of a valve has to be less than 200 degrees. A similar example of a dependency between several attributes is the expression that the maximum allowable working temperature has to be less than twice the minimum allowable working temperature. The checks on the leaf level are then combined by connection operators represented as internal nodes. Connection operators are conjunction, disjunction, and negation. Although it is redundant, implication should also be implemented as a connection operator because of its frequent use. This combination of rule parts can be continued arbitrarily. The final result is represented by the root of the tree.

    Additionally, we have to differentiate between intra-item and inter-item rules. The former define attribute checks only for one object type, the latter for various object types, whereby matching partners have to be identified via a matching rule. An example from the context of plant engineering is a rule for a valve in conjunction with its host pipe: First, associated valve and pipe instances have to be found, before a rule regarding the compatibility of the maximum allowable working pressure of both instances can be checked.

    When executing the rule, instances that violate the rule are reported back to the engineer. In order to identify huge deviations quickly and thus mark possibly urgent problems, we have implemented interval-scaled correctness values according to the recommendations in [23]. That way, normalized, easily interpretable [0,1]-scaled correctness values help the engineers in ranking their tasks.

    Furthermore, we have found that it is very important to allow for the linking of atomic rules to design steps. In our application domain, many engineers asked for this possibility in order to test rules only on instances that are in a specific design step. This reflects the fact that many rules are less restrictive in earlier than in later design steps.

    Referring to Figure 1, we implemented the rule model by means of five entity types. The checks on the leaf level are stored in the entity type Atomic Rule. If the check states a dependency between several attributes, i.e. more than one attribute is involved, the arbitrarily long dependency is stored in the entity type Formula and then referenced by Atomic Rule. The combination of several atomic rules by means of connection operators is stored in the entity type Rule Connection. The entity type Rule Metadata finally provides the metadata, i.e. the name and the owning person, to every rule connection and references a Matching Rule in case of inter-item rules. Storing the rules by means of these five entity types has several advantages: The atomic rules are stored without any redundancy and can be referenced in various rule connections. Additionally, conflict tests can be executed on atomic rules. Separating the rule metadata makes it possible to copy already stored rule connections to other persons or test them for different matching rules. In our application domain, all necessary rules can be expressed by means of this modeling concept. Of course, other application domains may require other modeling concepts.

    All entity types of the rule model belong to the category of user-specific metadata, are defined by the engineers themselves and can change during the project.
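
    A rule tree of this kind can be evaluated recursively. The sketch below is an illustration only: it keeps rules as in-memory Python objects instead of the five entity types, and the attribute names (t_max, t_min, p_max, host_pipe), the implication example, and the direction of the pressure compatibility check are assumptions not taken from the source. The two temperature checks are the ones given as examples in the text. The [0,1]-scaled correctness values from [23] and the linking of atomic rules to design steps are not covered.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple, Union

Instance = Dict[str, Any]


@dataclass(frozen=True)
class AtomicRule:
    """Leaf of the rule tree: one check on the attributes of an instance."""
    description: str
    check: Callable[[Instance], bool]


@dataclass(frozen=True)
class RuleConnection:
    """Internal node: combines sub-rules with a connection operator."""
    operator: str                   # "and", "or", "not" or "implies"
    operands: Tuple["Rule", ...]


Rule = Union[AtomicRule, RuleConnection]


def evaluate(rule: Rule, instance: Instance) -> bool:
    if isinstance(rule, AtomicRule):
        return rule.check(instance)
    values = [evaluate(operand, instance) for operand in rule.operands]
    if rule.operator == "and":
        return all(values)
    if rule.operator == "or":
        return any(values)
    if rule.operator == "not":
        (value,) = values
        return not value
    if rule.operator == "implies":  # redundant, but convenient to state
        premise, conclusion = values
        return (not premise) or conclusion
    raise ValueError(f"unknown connection operator: {rule.operator}")


def inter_item_violations(valves: List[Instance], pipes: List[Instance],
                          matching_rule: Callable[[Instance, Instance], bool],
                          pair_rule: Callable[[Instance, Instance], bool]
                          ) -> List[Tuple[str, str]]:
    """Find matching partners first, then report pairs that violate the rule."""
    return [(v["id"], p["id"]) for v in valves for p in pipes
            if matching_rule(v, p) and not pair_rule(v, p)]


# Intra-item rule built from the two temperature checks named in the text.
t_max_below_200 = AtomicRule("t_max < 200", lambda i: i["t_max"] < 200)
t_max_below_twice_t_min = AtomicRule("t_max < 2 * t_min",
                                     lambda i: i["t_max"] < 2 * i["t_min"])
temperature_rule = RuleConnection("and", (t_max_below_200, t_max_below_twice_t_min))

# Hypothetical implication: hot valves need a minimum pressure rating.
is_hot = AtomicRule("t_max >= 200", lambda i: i["t_max"] >= 200)
has_rating = AtomicRule("p_max >= 16", lambda i: i["p_max"] >= 16)
hot_implies_rating = RuleConnection("implies", (is_hot, has_rating))

valve_ok = {"id": "V1", "t_max": 150, "t_min": 100, "p_max": 16, "host_pipe": "P1"}
valve_bad = {"id": "V2", "t_max": 250, "t_min": 100, "p_max": 10, "host_pipe": "P1"}
pipe = {"id": "P1", "p_max": 16}

violations = inter_item_violations(
    [valve_ok, valve_bad], [pipe],
    matching_rule=lambda v, p: v["host_pipe"] == p["id"],
    pair_rule=lambda v, p: v["p_max"] >= p["p_max"],  # assumed compatibility check
)
print(evaluate(temperature_rule, valve_ok), evaluate(temperature_rule, valve_bad))
print(violations)
```

    Because atomic rules are plain objects, the same leaf can be referenced from several rule connections, which corresponds to the redundancy-free storage described above.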

    D. Obligation Rules

    For concurrent engineering, it is essential that subsequent design steps are supplied with preliminary and partial information from previous design steps. In real engineering environments, this means that newly filled attribute values become visible for engineers in different design steps at the same point in time. However, the obligation of the visible attribute values varies according to the role of the viewing person and the current design step and object type of the corresponding object (or instance). Because of that, obligation rules have to be stated that assign the adequate obligation type to attributes according to the mentioned parameters. This makes it possible to show the obligation level of received attribute values to every engineer, thus keeping the engineer from basing his/her design on unreliable attribute values.

    In our application domain, a differentiation between three obligation types fulfilled the engineers' requirements. Thus, we differentiate between reliable attribute values (i.e. the attribute values will not change again), slightly unreliable attribute values (i.e. the attribute values will possibly change again) and highly unreliable attribute values (i.e. the attribute values will most likely change again, for example due to estimation issues). Of course, one can choose more fine-grained types if necessary. Based on these types, every obligation rule consists of the following five parameters: If an object of type t is in design step x, role y has obligation z for attribute v. Obligation rules can be stated either on attribute level or on whole attribute groups.

    Referring to Figure 1, we implemented the obligation rules with the entity types Obligation Rule and Obligation Type. Both entity types belong to the category of project-specific metadata and have to be defined once at the beginning of the project.
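
    The five-parameter obligation rule maps naturally onto a keyed lookup. The following minimal sketch assumes rules stated on attribute level only (attribute groups are not covered) and returns None when no rule applies, which is a choice made for this example since the source does not define a default. The three obligation types are from the text; the object type, step, role, and attribute names are hypothetical.

```python
from enum import Enum
from typing import Dict, Iterable, NamedTuple, Optional, Tuple


class ObligationType(Enum):
    RELIABLE = "reliable"                        # will not change again
    SLIGHTLY_UNRELIABLE = "slightly unreliable"  # will possibly change again
    HIGHLY_UNRELIABLE = "highly unreliable"      # will most likely change again


class ObligationRule(NamedTuple):
    object_type: str             # t
    design_step: str             # x
    role: str                    # y
    attribute: str               # v
    obligation: ObligationType   # z


class ObligationRules:
    def __init__(self, rules: Iterable[ObligationRule]):
        self._index: Dict[Tuple[str, str, str, str], ObligationType] = {
            (r.object_type, r.design_step, r.role, r.attribute): r.obligation
            for r in rules
        }

    def lookup(self, object_type: str, design_step: str, role: str,
               attribute: str) -> Optional[ObligationType]:
        """Obligation of `attribute` as seen by `role`, or None if no rule exists."""
        return self._index.get((object_type, design_step, role, attribute))


# Hypothetical rules: the same attribute is an estimate early on and firm later.
rules = ObligationRules([
    ObligationRule("valve", "process_design", "piping_engineer", "t_max",
                   ObligationType.HIGHLY_UNRELIABLE),
    ObligationRule("valve", "piping", "piping_engineer", "t_max",
                   ObligationType.RELIABLE),
])

early = rules.lookup("valve", "process_design", "piping_engineer", "t_max")
late = rules.lookup("valve", "piping", "piping_engineer", "t_max")
print(early.value, "->", late.value)
```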

    E. Interaction Messages

    The situation in which persons have to be informed when an object in a previous design step has changed was already explained in the subsection on the Design Step Model and Responsibility. Who exactly has to be informed is determined by the possibility to personally adopt roles, as a fill-responsible or interested person, for specific objects. It was also mentioned above that an engineer in a subsequent design step needs a way to easily and immediately contact previous fill-responsible persons if s/he perceives problems with received attribute values. Without this possibility, the engineer is tempted to change the attribute value illegally by him/herself or to work around the possible error, which defers its detection.

    Consequently, an easily usable interaction messaging mechanism should be provided to the engineers to support and speed up their communication. For traceability reasons, all sent interaction messages should be stored append-only: the interaction messages can be marked as done, but are never deleted. In our implementation, this is done with the entity type Interaction Message. It is important to note that the receiver(s) of every message are automatically determined via the responsibility metadata mentioned above. Additionally, every message has a certain type. In our application domain, we need four types: forced-read, inform-about-change, inform-about-reject, and offline-coordination-necessary. The possible message sequences are predefined. For example, if engineer A detects a problem with an attribute value received from engineer B, A sends B a forced-read message stating the problem and asking B to check this attribute value again. B can then accept or reject the request, and a message of type inform-about-change or inform-about-reject, respectively, is sent back to A. If B accepts the change, the inform-about-change message is also sent to all affected engineers. If any of the affected engineers then rejects, the message sequence ends with an offline-coordination-necessary message to all participants. The entity type Interaction Message belongs to the category of user-specific metadata.
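The predefined message sequence and the append-only storage can be sketched as follows. The class names, field names, and the functions `respond` and `escalate` are hypothetical. As a simplification, receivers are passed in explicitly here, whereas the approach described above derives them from the responsibility metadata.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

FORCED_READ = "forced-read"
INFORM_ABOUT_CHANGE = "inform-about-change"
INFORM_ABOUT_REJECT = "inform-about-reject"
OFFLINE_COORDINATION = "offline-coordination-necessary"


@dataclass
class InteractionMessage:
    msg_type: str
    sender: str
    receiver: str
    text: str
    done: bool = False  # messages are marked as done, never deleted


class MessageLog:
    """Append-only store: there is deliberately no delete operation."""

    def __init__(self) -> None:
        self._messages: List[InteractionMessage] = []

    def append(self, message: InteractionMessage) -> InteractionMessage:
        self._messages.append(message)
        return message

    def all(self) -> Tuple[InteractionMessage, ...]:
        return tuple(self._messages)


def respond(log: MessageLog, request: InteractionMessage, accepted: bool,
            affected: Sequence[str]) -> List[InteractionMessage]:
    """Answer a forced-read request by accepting or rejecting it."""
    request.done = True
    if accepted:
        msg_type, receivers = INFORM_ABOUT_CHANGE, [request.sender, *affected]
    else:
        msg_type, receivers = INFORM_ABOUT_REJECT, [request.sender]
    return [log.append(InteractionMessage(msg_type, request.receiver, r, request.text))
            for r in receivers]


def escalate(log: MessageLog, rejecting_engineer: str,
             participants: Sequence[str], text: str) -> List[InteractionMessage]:
    """An affected engineer rejects: notify all participants (sender included)."""
    return [log.append(InteractionMessage(OFFLINE_COORDINATION, rejecting_engineer, p, text))
            for p in participants]


log = MessageLog()
request = log.append(InteractionMessage(FORCED_READ, "A", "B", "please re-check flow_rate"))
replies = respond(log, request, accepted=True, affected=["C", "D"])
final = escalate(log, "C", ["A", "B", "C", "D"], "offline coordination needed")
```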

    F. Traceability of Rule Execution

    Whenever an engineer executes a rule, the results of the rule execution, the name of the executing engineer, and the date of execution are documented. This is necessary to trace the rule execution process. By aggregating these data, questions about the project status can be answered. Examples are "How many possible mistakes per object type are still open?" or "Which mistakes are corrected tardily after their detection?". Thus, time scheduling and the prioritization of tasks can be improved, e.g., by rearranging the allocation of resources.

    Referring to Figure 1, we implemented the documentation of the rule execution with the entity types Report, Possible Mistake, and Instance ID. The entity type Report contains the name of the executing engineer, the date of execution, and the text and reference of the executed rule. All possible mistakes found by the executed rule are stored in Possible Mistake, which is referenced by Report. Because more than one instance is related to a possible mistake in the case of inter-item rules, the instance IDs, together with the name of their source IMS, are stored in a separate entity type, namely Instance ID. All entity types are filled automatically when an engineer executes a rule.
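The first example question, the number of still-open possible mistakes per object type, can be answered by a simple aggregation over the stored reports. The sketch below is an assumption-laden illustration: the field names are hypothetical, and in particular the fields `object_type` and `is_open` on a possible mistake are assumed here because the text does not state how object type and status are recorded.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import date
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class InstanceID:
    instance_id: str
    source_ims: str  # name of the IMS the instance comes from


@dataclass
class PossibleMistake:
    object_type: str                   # assumed field
    instances: Tuple[InstanceID, ...]  # more than one for inter-item rules
    is_open: bool = True               # assumed field


@dataclass
class Report:
    engineer: str
    executed_on: date
    rule_text: str
    rule_reference: str
    possible_mistakes: List[PossibleMistake] = field(default_factory=list)


def open_mistakes_per_object_type(reports: Iterable[Report]) -> Counter:
    """Count the possible mistakes that are still open, grouped by object type."""
    counts: Counter = Counter()
    for report in reports:
        for mistake in report.possible_mistakes:
            if mistake.is_open:
                counts[mistake.object_type] += 1
    return counts


# Hypothetical example data.
reports = [
    Report("alice", date(2011, 3, 1), "pressure <= 16 bar", "R-1", [
        PossibleMistake("pump", (InstanceID("P-100", "IMS-A"),)),
        PossibleMistake("pump", (InstanceID("P-101", "IMS-A"),), is_open=False),
    ]),
    Report("bob", date(2011, 3, 2), "valve matches pipe", "R-2", [
        PossibleMistake("valve", (InstanceID("V-7", "IMS-A"), InstanceID("L-3", "IMS-B"))),
        PossibleMistake("pump", (InstanceID("P-102", "IMS-B"),)),
    ]),
]

open_counts = open_mistakes_per_object_type(reports)
```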

    V. DISCUSSION

    The contribution of this paper is the proposal of the metadata categories necessary to support concurrent engineering. Additionally, practical recommendations regarding the implementation of these categories are given. Regarding those recommendations and the entity-relationship diagram in Figure 1, it is important to note that Figure 1 is just our implementation of the presented metadata categories for the concrete requirements in the enterprise of our project partner. Figure 1 should serve as an example of how the proposed metadata categories can be implemented; in other application scenarios, this implementation probably has to be adapted. For example, design steps can also have a hierarchical order instead of the sequential one we have implemented. Additionally, rules can be stored in various ways. These issues have to be carefully discussed when implementing the proposed metadata categories in other application domains. However, the problem classification and analysis activities in the enterprise of our project partner leave no doubt that the six presented metadata categories are, regardless of their concrete implementation, necessary in concurrent engineering environments.

    For enterprises, this paper raises awareness of what is needed to switch successfully to concurrent engineering. Unfortunately, according to our experience, many enterprises switched from sequential to concurrent engineering without building adequate support for the engineers. As a result, the time- and cost-saving potential of concurrent engineering cannot be used to its full capacity. Frequently, late feedback in the form of late error detection leads to a huge rework cycle and thus cancels out the time savings previously achieved by parallelizing design steps. That is why it is very important that enterprises become aware of the needs that concurrent engineering entails.

    For the data quality tool industry, this paper shows new requirements that arise because of concurrent engineering. Because we have presented an IMS-neutral approach, enterprises can easily test our approach although the tool industry has not yet discovered this problem domain.

    For the scientific community, our approach shows how to generalize application-specific data quality problems and build a practical method to support concurrent engineering. This approach is far more detailed and usable than other data quality management approaches from the scientific literature. Hopefully, this approach motivates enterprises, researchers, and the tool industry to extend it further and especially to improve the detection and prevention of data quality problems originating from concurrent engineering.

    It is important to note that our approach is completely useless if the engineers do not use it. Thus, the interaction design of the user interface is very important and needs to be developed in close cooperation with engineers. Additionally, it has to be mentioned that, in the long term, replacing the existing IMS landscape is still the best option. The new IMS landscape has to provide strong support for concurrent engineering and high usability, and it needs to be evolutionary. The latter is very important because IMSs are ever-changing. Thus, only a highly adaptable and evolutionary IMS landscape can endure.

    VI. CONCLUSION

    In this paper, we have presented metadata categories for supporting concurrent engineering. We have discussed why concurrent engineering poses unique challenges and how an apt IT infrastructure can help overcome them. Based on this discussion and on our general approach of building a metadata repository that is independent of the existing IMS landscape, we have detailed the main contribution of this paper: the description of the metadata categories that are required to overcome common problems in concurrent engineering. For each category, we have explained which problems may be solved with the help of the respective type of metadata and for which group of users it provides benefits. We have also explained by whom each type of metadata has to be entered and how general each type is (project- vs. user-specific).

    The implementation of our metadata repository is completed and has already been evaluated by a small number of end users. As a next step, we will extend the evaluation to a larger number of users in order to get more significant results about its function and usability. After that, we plan to implement the metadata repository in a different application domain to test its adaptability to various environments.

    REFERENCES[1] J. Preece, Y. Rogers, and H. Sharp, Interaction Design: Beyond Human-

    Computer Interaction. John Wiley & Sons, Inc., 2002.[2] J. Blechinger, F. Lauterwald, and R. Lenz, Supporting the production of

    high-quality data in concurrent plant engineering using a metadatarepos-itory, in 16th Americas Conference of Informtion Systems (AMCIS),2010.

    [3] B. Prasad, R. S. Morenc, and R. M. Rangan, Information managementfor concurrent engineering: Research issues, Concurrent Engineering,vol. 1, no. 1, pp. 320, 1993.

    [4] B. Prasad, System integration techniques of sharing and collaborationamong work-groups, computers and processes, Journal of SystemsIntegration, vol. 9, pp. 115139, 1999.

    [5] A. Yassine and D. Braha, Complex concurrent engineering and thedesign structure matrix method, Concurrent Engineering, vol. 11, no. 3,pp. 165176, September 2003.

    [6] B. K. Kahn, D. M. Strong, and R. Y. Wang, Information qualitybenchmarks: Product and service performance, Communications of theACM, vol. 45, no. 4, pp. 184192, April 2002.

    [7] Y. Wand and R. Y. Wang, Anchoring data quality dimensions inontological foundations, Communications of the ACM, vol. 39, no. 11,pp. 8695, November 1996.

    [8] R. Y. Wang, A product perspective on total data quality management,Communications of the ACM, vol. 41, no. 2, pp. 5865, February 1998.

    [9] L. English, Total information quality management: A complete method-ology for iq management, DM Review, vol. 9, pp. 17, September 2003.

    [10] Y. W. Lee, D. M. Strong, B. K. Kahn, and R. Y. Wang, Aimq:a methodology for information quality assessment, Information &Management, vol. 40, no. 2, pp. 133146, December 2002.

    [11] G. Shankaranarayan and R. Y. Wang, Ip-map: Representing the man-ufacture of an information product, in Proceedings of the 5th Interna-tional Conference on Information Quality (ICIQ00), 2000, pp. 116.

    [12] M. Helfert and C. Herrmann, Proactive data quality management fordata warehouse systems - a metadata based data quality system, in Pro-ceedings of the 4th International Workshop on Design and Managementof Data Warehouses (DMDW02), 2002, pp. 97106.

    [13] D. Becker, W. McMullen, and K. Hetherington-Young, A exible andgeneric data quality metamodel, in Proceedings of the 12th Interna-tional Conference on Information Quality, November 2007, pp. 5064.

    [14] A. Kovacic, Business renovation: business rules (still) the missing link,Business Process Management Journal, vol. 10, no. 2, pp. 158170,2004.

    [15] H. Herbst, Business rules in systems analysis: a meta-model andrepository system, Information Systems, vol. 21, no. 2, pp. 147166,1996.

    [16] J. Saat, U. Franke, R. Lagerstrom, and M. Ekstedt, Enterprise architec-ture meta model for it/business alignment situations, in Proceedings ofthe 14th IEEE International Enterprise Distributed Object ComputingConference, 2010, pp. 1423.

    [17] W. Kim, B.-J. Choi, E.-K. Hong, S.-K. Kim, and D. Lee, A taxonomyof dirty data, Data Mining and Knowledge Discovery, vol. 7, no. 1, pp.8199, 2003.

    [18] P. Oliveira, F. Rodrigues, and P. Henriques, A formal denition of dataquality problems, in Proceedings of the 10th International Conferenceon Information Quality (ICIQ05), 2005, pp. 1326.

    [19] C. Batini and M. Scannapieco, Data Quality, Concepts, Methodologiesand Techniques. Springer, 2006.

    [20] W. Fan, F. Geerts, and X. Jia, Conditional dependencies: A principledapproach to improving data quality, in British National Conference onDatabases (BNCOD09), 2009, pp. 820.

    [21] J. Barateiro and H. Galhardas, A survey of data quality tools,Datenbank-Spektrum, vol. 14, no. 5, pp. 1521, 2005.

    [22] H. Dubbel, Taschenbuch fur den Maschinenbau. Springer, 2004.[23] B. Heinrich, M. Klier, and M. Kaiser, A procedure to develop metrics

    for currency and its application in crm, ACM Journal of Data andInformation Quality, vol. 1, no. 1, pp. 128, June 2009.
