WHITE PAPER

Identity Resolution: A Key to Customer Data Integration Value


This document contains Confidential, Proprietary, and Trade Secret Information (“Confidential Information”) of Informatica Corporation and may not be copied, distributed, duplicated, or otherwise reproduced in any manner without the prior written consent of Informatica.

While every attempt has been made to ensure that the information in this document is accurate and complete, some typographical errors or technical inaccuracies may exist. Informatica does not accept responsibility for any kind of loss resulting from the use of information contained in this document. The information contained in this document is subject to change without notice.

The incorporation of the product attributes discussed in these materials into any release or upgrade of any Informatica software product—as well as the timing of any such release or upgrade—is at the sole discretion of Informatica.

Protected by one or more of the following U.S. Patents: 6,032,158; 5,794,246; 6,014,670; 6,339,775; 6,044,374; 6,208,990; 6,208,990; 6,850,947; 6,895,471; or by the following pending U.S. Patents: 09/644,280; 10/966,046; 10/727,700.

This edition published November 2008.


Table of Contents

Executive Summary
What Is Customer Data Integration?
    CDI Provides a 360-Degree View of the Customer
    Identity Resolution Enables CDI
The Triple Threat to Matching Records
    Consolidating Data in Disparate Data Structures
    Handling Anomalous Data Entry
    Identifying Opportunities for Staff Training
It’s Science, Not Magic
Tolerance of Errors
A Rose by Any Other Name: The End of Yuko’s Story


Executive Summary

Customer data integration (CDI) requires a combination of practices, techniques, and tools to effectively consolidate and manage a unique view of your organization’s customers. An effective CDI program involves identifying the sources of customer data across the enterprise and consolidating pieces of current information about a customer to provide users and application systems with an accurate, current and holistic view of that customer.

Yet how are these CDI applications able to handle the many possible errors, variations, and anomalies that prevent applications from finding existing customers within their systems?

A powerful concept called identity resolution exploits heuristic, phonetic, and statistical algorithms to locate matches even in the presence of variation.

This white paper explores the value that identity resolution adds to CDI.

What Is Customer Data Integration?

Before we define customer data integration, let’s start with a story to illustrate the need for CDI.

It’s the night before a major presentation to a FORTUNE 500 client in New York City, and Yuko Tanaka can’t get her laptop to boot up after her long flight from Tokyo. Fortunately, her laptop is an American brand.

She dials the toll-free support number and gives the support desk her name and the serial number from her laptop. The laptop support rep, physically located in Mumbai, types in Y-U-K-O T-A-N-A-K-A and the serial number. Nothing comes back. There’s no record of Yuko as a customer.

The support rep tries different spellings, just in case. He types in her address, the serial number, and other product details in an effort to initiate triage. With each passing minute, Yuko’s panic grows and the anxiety of the support rep increases—his performance is rated on how quickly he either ends the call or escalates it. After 15 minutes of record search strategies, Yuko has had enough—she’s upset and has gotten nowhere.

Because the laptop manufacturer has a history of making post-sale support a priority, the executive team had jumped on customer data integration as a way to integrate current information about any given customer across its operational systems. The call center was the initial application, and the company’s talented stable of engineers vied for the opportunity to learn about hub technologies by developing the call center solution.

They built a completely custom solution with feeds and speeds that matched those of the company’s fastest transaction engine. The engine’s interface allowed customer service reps to shave an average of 42 seconds off each inbound call, and acceptance tests found and correctly matched customer records with 98.1 percent accuracy.

So why can’t the call center help Yuko?

CDI Provides a 360-Degree View of the Customer

CDI is a set of procedures, controls, skills, and automation that standardize and integrate customer data originating from multiple sources.[1] It is a strategy for consolidating pieces of current information about a customer to provide users and application systems with an accurate, current, and holistic view of that customer. The information can come from any source, ranging from formal systems to desktop spreadsheets.

[1] Jill Dyché and Evan Levy, Customer Data Integration: Reaching a Single Version of the Truth (New York: John Wiley & Sons, 2006), p. 274.


A CDI application is intended to capture as much information as possible: not just who your customers are, but also the different ways they can be contacted, the products they purchase, their customer support interaction histories, the relationships between customers, and the many other touch points that may occur between your organization and the customer. Figure 1 illustrates what a very simple call center CDI application might be expected to provide about Yuko.

CDI is not the same as a relational database, a customer relationship management (CRM) tool or suite, a data warehouse, or an operational data store (ODS). But it may draw customer data from relational databases, deliver operational customer data to a CRM tool, provide updates and corrections to a data warehouse, and facilitate the organization of customer data within an ODS.

Through their structure and deployment, CDI solutions add value in interesting ways:

• Leverage operational data. Because the hub technology on which CDI is built is usually segregated from operational environments, it does not compete with operational systems for resources and therefore does not degrade their performance.

• Maintain all raw data. Although data is standardized and cleansed before being loaded into a data warehouse, CDI hubs do not require transformations, allowing them to capture the variant representations of data (and consequently, their added meanings).

• Select the best content. Through the collection of many different versions of representations of individual customers, different criteria can be applied to qualify those versions, resulting in a virtual “best record” pulled from all data sources.

But most importantly, CDI solutions supply a framework for unique identification of customers across, and even beyond, the enterprise application architecture.

Identity Resolution Enables CDI

By applying sophisticated algorithms, CDI applications can use a customer’s identifying information to find matching records about that customer from multiple sources, regardless of structural anomalies and quality problems.

This process, called identity resolution, is the core enabling technique of customer data integration. Because the accuracy of the CDI record depends on how well all records about a particular individual are found, catalogued, and matched, identity resolution is a critical CDI success factor. Without robust identity resolution infrastructure, the result of any CDI processing will be suspect and potentially costly to the business.

[Figure 1 diagram: a Customer record (Yuko Tanaka; XYZ Industries, ABC Division; 1-6-2 Akasaka, Minato-Ku, Tokyo, Japan) shown with Orders, Products, Maintenance, and Campaigns data. Sample identifiers in the diagram: PTA0402U, 012009, EB05-351, 3259789-P, 08012007, 950677124.]

Figure 1. A call center CDI application might provide a variety of data about customer Yuko Tanaka, including her purchased products, maintenance history, and address.


Clearly, in our example of Yuko and the support agent, the inability to locate any information about Yuko or her laptop purchase is attributable to failures in the application’s identity resolution processes. But what types of issues lead to failure to locate customer data? There are a number of very common barriers to efficient and effective identity resolution, any one of which could affect the ability to find and accurately profile a customer.

The Triple Threat to Matching Records

Business has been automating specific functions since the 1960s, usually starting with accounting and automating additional functions one at a time. As a result, systems tend to be self-contained, reflecting a unique functional view of the business and the customer.

However, the popularity of data warehouse projects designated for strategic decision support and other enterprise-wide analytical activities introduces a need to understand the business from a cross-functional or enterprise perspective.

Integrating the data is not simply a matter of linking content from various systems. Incompatible data structures, inconsistent use of business terminology, and duplicate but inconsistent records all contribute to the issue, increasing apparent complexity as the data sources proliferate.

A successful CDI solution faces three main challenges:

1. Effectively consolidating customer data that resides in disparate data structures

2. Dealing with the introduction of anomalies and variations during data collection

3. Identifying opportunities for proper training in high-quality techniques for capturing data

Consolidating Data in Disparate Data Structures

To illustrate the challenge of consolidating customer data from multiple sources, consider the sea change that started in the 1990s. As organizations began to migrate from product transaction operational systems to customer-centric ones, it became apparent that in many industries, each product category tended to have its own supporting application system. At the same time, few or none of the product application systems shared data with, or even resembled, any of the other product application systems.

For example, a bank typically managed savings accounts, checking accounts, commercial accounts, and safety deposit boxes independently from each other. Each product record contained a unique account identifier, which didn’t match the account identifier for any other product. And identifying information for the customer in one application’s database might not match the identifying information for the same customer in other product application data sets. At an institutional level, the bank could say with confidence how many unique customers it had for each product, but without an intense manual effort, the bank was unable to answer even basic customer-centric questions such as:

• How many customers does the bank have?

• What is the average number of bank products owned by a household?

• What is a typical portfolio of products owned by a customer?

• Which customers have personal and commercial accounts at the same branch?


To answer any of these questions, information relevant to each customer or household has to be identified and verified and the content consolidated using the best representation of each element. For example, consider how the data in Figure 2 illustrates the identity resolution challenge.

The structuring of information is often much less consistent in real life than in the example in Figure 2. However, the differences in content are sufficient to show the challenge in the process of record searching and matching to determine whether any of these records refers to the same person and not any other Jonathan, Jackie, or Jon Jones. In this case, we cannot rely on any one field for linkage or disambiguation:

• Account field. The account data is inadequate because account identifiers are designed to help the bank keep track of the product. Apparently, the bank applies a naming convention for the account number consisting of:

Product ID - Branch ID - Customer ID

Each customer ID represents unique ownership of the product described by product ID and branch ID; hence, there’s no common identifier for Mr. Jones that transcends all banking products across all banking channels.

• Address field. The address is insufficient because multiple people may share an address and because population mobility makes address data obsolete relatively quickly. You might think that using the address in conjunction with the name would be enough to identify an individual with near certainty. However, the proliferation of siloed source systems, so common in today’s business environments, exacerbates inconsistent content about customers. In other words, because records in one system might be updated while others might not, you may not know which address is correct or current. When one of the country’s largest gas and electric utilities re-engineered its business systems in the late 1980s, the company discovered more than 100 separate supplier records for IBM. The company name included such variations as “International Business Machines,” “I.B.M.,” “I. B. M.,” and “ibm.” There were almost as many different business addresses as records, representing everything from IBM headquarters to divisional offices to distribution centers.

In the example in Figure 2, Mr. Jones’ childhood savings account still lists his parents’ address; the statements are delivered there and Mr. Jones receives them. He ignores the account, and he’s not been motivated to update the address.

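Figure 2. Variations in identity data obscure the single view of the customer.

[Table of four account records with columns Product Category, Account, LastName, FirstName, MI, Street, City, Zip. Values are listed in source order; the row alignment of the name and address cells is not recoverable.]
Product Category: Online Checking; Personal Savings; Personal Savings; Commercial Checking
Account: 7A-301-22451; 11-301-13421; 1A-492-8874112; SB3-001-9865KL
Name cells (LastName, FirstName, MI): Jones; Jones; Jones; Jonathan; Jackie; J; J; Jon; Jones Concrete & Masonry
Street: 124 Oxford; 32 Selby Lane Atherton, CA 94027; 124 Oxford; PO Box 62
City: Redwood City; Redwood City; San Francisco
Zip: 94061; 94061; 94111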


• Name fields. Even used together, name fields are unreliable for many reasons. For example, some individuals have more than one name, while different people might share the exact same name. The longer a customer stays with a company and purchases additional products, the more likely variations in his name will be recorded. Figure 3 presents some commonly occurring name variations.

Identity resolution is also a challenge in other industries. For example, hotel desk clerks and call center operators focus on serving the customer quickly and, therefore, tend not to ask for correct spelling or to make sure they enter any more than the minimum information required to get the system to accept a record—particularly if their compensation is tied to transaction duration. As a result, the data in many systems is “dirty” and can further thwart quick and accurate identity resolution.
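The account-number convention described under the account field above (Product ID - Branch ID - Customer ID) can be made concrete with a short sketch. It splits the four account numbers from Figure 2 on hyphens and shows that the customer segments have nothing in common across products. The class and function names are illustrative assumptions, and a real bank's numbering rules are likely to be more involved.

```python
from typing import NamedTuple


class AccountNumber(NamedTuple):
    product_id: str
    branch_id: str
    customer_id: str


def parse_account(raw: str) -> AccountNumber:
    """Split 'Product ID - Branch ID - Customer ID' into its three parts."""
    parts = [part.strip() for part in raw.split("-")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected account format: {raw!r}")
    return AccountNumber(*parts)


# Account numbers as they appear in Figure 2.
accounts = ["7A-301-22451", "11-301-13421", "1A-492-8874112", "SB3-001-9865KL"]
parsed = [parse_account(account) for account in accounts]

# Every customer segment is different, so the account field alone
# cannot link these four records to one person.
distinct_customer_ids = {p.customer_id for p in parsed}
print(len(distinct_customer_ids), "distinct customer IDs in", len(parsed), "accounts")
```

Because each customer ID is scoped to one product and branch, any linkage has to come from the name and address fields, which is where the matching techniques discussed later come in.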

Handling Anomalous Data Entry

Companies in nearly every industry struggle with accuracy in data entry. Records are commonly corrupted by the most innocent mistakes and lapses in judgment. Simple examples include:

• The operator makes a typographical or spelling error during data entry, such as “San Francisoc” instead of “San Francisco,” or “Shivaun” instead of “Siobhan.”

• The customer provides a different version of his name, doesn’t know his account number, or has moved since the last transaction.

• The system automatically inserts a default value (for example, a date like 1/1/0 or January 1, 1900) whenever the date field is left blank.[2]

• Detail data and notes are entered in the wrong field.

Let’s look at an example of this last point. A substantial amount of data entered into the wrong field comes to light when national change of address (NCOA) processing is implemented. The USPS issues address updates quarterly based on notifications people submit when they move. Software vendors certified by the USPS then issue the updates to their customers. NCOA processing uses a complex algorithm to match the records in customer or marketing systems with the USPS updates, and it aligns the address format with USPS standards to facilitate mail delivery.

Variation Type | Examples
Nicknames | William, Bill, Billy, Will
Name Variation | Chris, Kris, Christie, Krissy, Christy, Christine, Tina
Abbreviations/Spelling | Mohammed, Mohd, Mohamad, Mhd, Muhammad
Foreign Versions | Peter, Pete, Pietro, Piere, Pierre
Spelling Variation | Johnson, Johnsen, Johnsson, Johnston, Johnstone, Jonson
Suffix Variation | Smith II, Smith I I, SMith Jr, Smith jnr
Initials/Order | Frank Lee Adam; A. Frank Lee; Lee Frank
Anglicization | De La Grande, Delagrande, D L Grande
Out of Sequence | Henry Tun Lye Aun; Mr.Aun Tun Lye (Henry)
Titles | Dr. Henry Lee, Henry Lee, M.D., Mr. Henry Lee

Figure 3. Variations in common names make name fields unreliable.

[2] Dyché and Levy, Customer Data Integration, p. 89.


We’ve seen ADDRESS LINE 2 fields abused in particular; customer-facing agents commonly insert credit card, customer preference, customer complaint, and offer usage information there because they’d have to go to another screen to enter the information in the right place or they don’t know there’s a field set aside for free-form notes. Tragically, all of that customer information is wiped out when NCOA runs and aligns the address fields with USPS standards.

As with structural issues, anomalous data across records makes accurate searching and matching much more difficult. To overcome these difficulties, good identity resolution products rely on business rules and a hybrid of statistical methods to assess the degree to which a set of records can be determined to refer to the same person.

Identifying Opportunities for Staff Training

Customer-facing staff are typically trained to deliver friendly and efficient service, and they are under pressure to deliver that service above all else. However, because information collection is seen as secondary to service delivery, these staff members may not be aware how the information they collect adds value across the organization. As a result, they may not:

• Bother to ask if the customer is returning, and therefore should already be known to the company’s systems

• Take time to search for the customer in the system if the initial name or spelling doesn’t return an existing record

• Confirm input, including spelling

Even worse, they may “game” the system to bypass data validation routines deliberately implemented to protect the quality of data at the point of entry, all in the interest of saving the customer time. A typical workaround is entering dummy values in required fields: nine zeroes for a Social Security number or 10 zeroes for a phone number.

Such lapses in judgment are usually a response to stress (a long line of waiting customers, for example) combined with tunnel vision about the staff member’s contribution to the greater business process. Training and monitoring can help reduce these types of data problems, as can aligning performance incentives with data quality objectives.
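As one illustration of the monitoring mentioned above, the following sketch flags the placeholder values this paper describes: all-zero Social Security and phone numbers, and the January 1, 1900 default date. The field names (`ssn`, `phone`, `birth_date`) and the record layout are assumptions made for the example, not part of any particular product.

```python
import re
from datetime import date

# Hypothetical field names mapped to the digit count each should contain.
DIGIT_FIELDS = {"ssn": 9, "phone": 10}
# System default inserted when a date field is left blank.
DEFAULT_DATES = {date(1900, 1, 1)}


def is_all_zero(value: str, expected_digits: int) -> bool:
    """True when the value holds exactly the expected number of digits, all zero."""
    digits = re.sub(r"\D", "", value)
    return len(digits) == expected_digits and set(digits) == {"0"}


def suspicious_fields(record: dict) -> list:
    """Return the names of fields that look like dummy or default entries."""
    flagged = []
    for field, length in DIGIT_FIELDS.items():
        if is_all_zero(str(record.get(field, "")), length):
            flagged.append(field)
    if record.get("birth_date") in DEFAULT_DATES:
        flagged.append("birth_date")
    return flagged


record = {"ssn": "000-00-0000", "phone": "(650) 555-0100", "birth_date": date(1900, 1, 1)}
print(suspicious_fields(record))
```

A report of flagged fields by agent or location is one way such a check could point to where training would help most.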

But as long as there are multiple data sources, there will be anomalies that make identity resolution challenging.

It’s Science, Not Magic

How do vendor solutions overcome all of the barriers to effective identity resolution? They use various established heuristic, phonetic, and multivariate statistical methods specially tailored to measure and test the relevance and likely meaning of variables used to search and match records.

Commercial searching demands have driven the need to find proverbial “needles in haystacks” for users quickly and easily. Companies trying to perfect search-and-match solutions have developed great depth and precision by applying sophisticated and effective string manipulation and statistical methods to matching problems. The process that has evolved is called approximate or “fuzzy” matching. Everything from the Google search engine to Microsoft Word’s spell check function uses fuzzy matching to suggest reasonable alternatives to misspellings or concepts that can’t be recognized. The choices are offered based on the probability that the alternative is really what you’re trying to find.


Identity resolution integrates a set of techniques for viewing identification data in a number of different ways and using those potential variations to create an indexing scheme that can help narrow searches.

For example, a person’s name can be seen as a string of characters, but can also be viewed as a sequence of phonemes. Approximate matching techniques can be applied to evaluate how closely two character strings match. Statistical analysis of word frequency provides weighting characteristics for data values pulled from multiple attributes.
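To make the two views of a name concrete, here is a minimal sketch of one phonetic code and one approximate string measure. Soundex and Levenshtein edit distance are used only because they are widely known; this paper does not say which algorithms any vendor uses, and commercial products are likely to rely on more refined methods.

```python
_SOUNDEX_GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
_CODES = {ch: digit for digit, letters in _SOUNDEX_GROUPS.items() for ch in letters}


def soundex(name: str) -> str:
    """Classic four-character Soundex code (ASCII letters only)."""
    letters = [c for c in name.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        prev = digit  # a vowel resets prev, so a repeated code is kept
    return (code + "000")[:4]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character inserts, deletes, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


# Examples taken from the data entry errors described earlier.
print(soundex("Siobhan"), soundex("Shivaun"))           # S150 S150
print(edit_distance("San Francisoc", "San Francisco"))  # 2
```

The phonetic code treats "Siobhan" and "Shivaun" as the same key, which suits indexing, while the edit distance quantifies how far a typo such as "San Francisoc" is from the intended value.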

All of these techniques contribute to the development of a similarity score that characterizes how closely two records match. By setting thresholds for the similarity scores, an application can automate the identity resolution process.
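The scoring and thresholding idea can be sketched as follows. The field weights, the two thresholds, and the use of Python's `difflib.SequenceMatcher` as the per-field measure are all assumptions chosen for illustration; in practice the weights would come from statistical analysis of the data and the thresholds from the business.

```python
from difflib import SequenceMatcher

# Hypothetical weights; a real system would derive them from value frequencies.
WEIGHTS = {"last_name": 0.4, "first_name": 0.3, "street": 0.2, "zip": 0.1}
MATCH_THRESHOLD = 0.85   # at or above: treat as the same customer
REVIEW_THRESHOLD = 0.60  # between the thresholds: route to manual review


def field_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; a missing value contributes no evidence."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b:
        return 0.0
    return SequenceMatcher(None, a, b).ratio()


def similarity_score(r1: dict, r2: dict) -> float:
    """Weighted sum of per-field similarities."""
    return sum(
        weight * field_similarity(r1.get(field, ""), r2.get(field, ""))
        for field, weight in WEIGHTS.items()
    )


def classify(score: float) -> str:
    if score >= MATCH_THRESHOLD:
        return "match"
    if score >= REVIEW_THRESHOLD:
        return "review"
    return "non-match"


a = {"last_name": "Jones", "first_name": "Jonathan", "street": "124 Oxford", "zip": "94061"}
b = {"last_name": "Jones", "first_name": "Jon", "street": "124 Oxford", "zip": "94061"}
score = similarity_score(a, b)
print(round(score, 2), classify(score))
```

Keeping a middle "review" band is one common design choice: it lets clear cases be automated while borderline pairs go to a person, which ties directly to the error tolerance discussed in the next section.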

Tolerance of Errors

Even in the best case, there are still situations where erroneous matches or mismatches occur. Because business risks vary, the error tolerance in identity resolution will vary accordingly. Identifying risks and defining acceptance thresholds can only happen in conjunction with the business client; partnering IT with the business side provides the right knowledge, skill, and accountability.

Two types of errors can occur as part of identity resolution. One, referred to as a “Type I” or “false positive” error, occurs when it is concluded that a set of records refers to the same customer when it really does not. The other, referred to as a “Type II” or “false negative” error, occurs when it is concluded that two or more records refer to different customers when they actually refer to the same customer. Figure 4 shows examples of both error types.

Figure 4. Both “false positive” and “false negative” identity resolution errors are common.
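The two error types can be counted directly when a sample of record pairs has been verified by hand. The sketch below compares the pairs a matcher linked against the pairs known to be the same customer; the record IDs and the verified sample are hypothetical.

```python
def pair(a: str, b: str) -> frozenset:
    """An unordered pair of record IDs."""
    return frozenset((a, b))


def resolution_errors(predicted: set, actual: set) -> tuple:
    """Return (false_positives, false_negatives) as sets of record pairs.

    predicted: pairs the matcher linked as the same customer
    actual:    pairs verified to be the same customer
    """
    false_positives = predicted - actual  # Type I: linked, but different people
    false_negatives = actual - predicted  # Type II: same person, but not linked
    return false_positives, false_negatives


# Hypothetical record IDs for illustration.
predicted = {pair("r1", "r2"), pair("r3", "r4")}
actual = {pair("r1", "r2"), pair("r4", "r5")}

type_1, type_2 = resolution_errors(predicted, actual)
print(len(type_1), "Type I error(s);", len(type_2), "Type II error(s)")
```

Tracking both counts separately matters because, as the examples that follow show, the business cost of each type can differ sharply.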


A false positive will induce a CDI record to include irrelevant information that may lead you to treat the target customer differently than you would if there were no error. And with a false negative, a CDI integrated record is skewed by missing information about the target customer and may therefore also lead you to treat that customer differently than you intend.

Consider the examples in Figure 5 and their potential effect on customer experience, brand, financial return, and other key performance indicators.

Clearly, a business wants to invest in perfection whenever its ability to keep the doors open is threatened. Absent perfection, the gaming corporation would be better off committing a Type I error when it comes to identifying disassociated players (wrongly excluding a legitimate player rather than missing a disassociated one), whereas the modular home manufacturer might prefer to commit Type II errors when segmenting its customers for discount programs (withholding a discount from a qualifying customer rather than granting one in error).

The Need for Speed

As if the challenge of finding and accurately matching records among volumes of transactional records from multiple sources weren’t enough, the time frame for completing the activity is getting shorter and shorter. The root issues include latency, completeness, and timeliness: the need to capture all transactions associated with the target customer, up to the most recent, as quickly as possible.

The U.S. Department of Homeland Security, for example, needs to process the most current travel plans for any given passenger against a number of internal watch lists, as well as against watch lists from relevant foreign countries. Its window for checking is between passenger check-in and flight takeoff. There’s no time to wait for an overnight batch run if the objective is to keep terrorists off of commercial flights.

Scenario | Error Type | Risk
A national modular home manufacturer discovers that their $18 operating budget variance was the result of applying a volume discount to several low-volume customers. | I | Damage to relationships with both low-volume customers and high-volume customers, and to status on Wall Street after making adjustments.
Coal Smoke Power pays out $1M in fraudulent electric conservation incentive awards. | I | Bad publicity, political backlash in response to the next rate case, and increase in fraudulent claims.
A wealth management customer enrolls in online bill pay but gets confused trying to set up payments and never activates the service. Her personal banker is not aware that she has not activated the service, and doesn’t follow up. | II | Missed opportunity to provide a service the customer wants. Missed opportunity to decrease operating costs.
A former customer now on a state’s Disassociated Players List redeems a casino offer from a recent marketing campaign. Disassociated Players are to be excluded from campaigns and anything else that might entice them to try and enter a casino. | II | Suspension of gaming license; State could close down the casino.

Figure 5. Common identity resolution errors can compromise a company’s key performance indicators.


At the other end of the risk spectrum, business trends in globalization and networked commerce similarly demand rapid analysis of up-to-the-minute information about a customer. As Figure 6 shows, Amazon and eBay offer familiar examples of highly networked merchandising mediated by a complicated but exacting infrastructure of business rules designed to protect each partner’s interests while enabling efficient and hassle-free shopping for customers. Much of the “hassle-free” quality of the experience depends directly on rapid identity resolution that satisfies all partners.

Identity resolution is a challenge for any one enterprise, but collaboration exponentially increases the difficulty of getting it right, particularly in the amount of time a customer will tolerate. As volumes increase and tolerance for processing time decreases, and the complexities of partner organizations are added to the mix, homegrown identity resolution infrastructures are less practical. For example, each partner may have a loyalty program and campaign offers that need to be identified and credited correctly. There are too many nuances about data and statistics for most generic application developers to succeed.

A Rose by Any Other Name: The End of Yuko’s Story

The inconsistencies described so far exist in just about any organization that has automated more than one of its business functions. The complexity of the problem varies from organization to organization, but global operations, outsourcing, and strategic partnerships are adding fuel to the fire universally. To return to the story that opened this white paper, this is why the support center could not help Yuko.

[Figure 6 diagram labels: Customer, Amazon.com, Order Processing, Payment Processing, Shipping, SideStep, TireRack Performance Specialists, Fidelity Investments.]

Figure 6. Networked commerce requires an infrastructure of business rules and identity resolution.


Global operations introduce new languages and alphabets, many of which are difficult to translate without loss of meaning. Figure 7 shows an example of the challenges of translating among the Roman, Arabic, and Thai alphabets.[3]

In addition, not all languages are written using alphabets. Syllabaries represent complete syllables with single symbols. Japanese kana and the Cherokee script are syllabaries, and syllabic scripts are also used for other Native American languages such as Ojibwa and Blackfoot. Chinese characters, which Japanese also uses alongside kana, are logographic: each symbol stands for a word or part of a word rather than a sound.

Any company doing business offshore, particularly in Africa and the Asia Pacific, is challenged to resolve identity across languages and character sets that do not offer one-to-one correspondence. A truly global operation may need an identity resolution infrastructure that competently operates across hundreds of languages and character sets.

In the case of Yuko, her laptop manufacturer had developed its identity resolution infrastructure internally at its U.S. headquarters. The manufacturer had accounted for language translation but not multiple character sets. Because Yuko’s laptop was purchased in Japan, her records were in Japanese and her Anglicized name could not be matched against the CDI application, as Figure 8 illustrates.

If the laptop manufacturer were using an identity resolution infrastructure that could match Yuko’s name in Japanese or in English, it still would not recognize Yuko as the same person across the two character sets. Once again, we see that the complexities of identity resolution are beyond the skill sets most organizations have in house.
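A small sketch shows why the two forms of Yuko's name cannot be matched by character comparison alone. It uses Python's standard `unicodedata` module to report which script each name is written in, then resolves the pair through a cross-reference table. That table is a hypothetical stand-in for a real transliteration component, and its single entry is taken from Figure 8.

```python
import unicodedata

# Hypothetical cross-reference standing in for a transliteration service.
ROMANIZED = {"田中 裕子": "Yuko Tanaka"}


def scripts(text: str) -> set:
    """Scripts present in the text, taken from the first word of each Unicode name."""
    return {unicodedata.name(ch, "UNKNOWN").split()[0] for ch in text if ch.isalpha()}


def name_key(name: str) -> frozenset:
    """Romanize if a cross-reference exists, then ignore case and word order."""
    romanized = ROMANIZED.get(name, name)
    return frozenset(romanized.lower().split())


def same_name(a: str, b: str) -> bool:
    return name_key(a) == name_key(b)


print(scripts("Yuko Tanaka"), scripts("田中 裕子"))
print("Yuko Tanaka" == "田中 裕子")            # direct comparison fails
print(same_name("Tanaka Yuko", "田中 裕子"))   # resolved through the cross-reference
```

The two strings share no characters, so string-distance measures have nothing to work with; some mapping between the character sets has to exist before the usual matching techniques can apply. Comparing unordered tokens also absorbs the family-name-first ordering common in Japanese records.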

The Internet creates a truly global marketplace. As the world gets smaller and the amount of information that businesses rely upon gets larger, the number of international names entering the data stream grows exponentially. Tools are needed to cover this dynamic range of linguistic diversity.

[3] Refer to www.omniglot.com.

[4] Upper- and lowercase characters are separate representations. Some languages using the Roman alphabet have additional characters; for example, ß has dropped out of English but is still used in German.

[5] Japanese content provided by Hiroko Yoshiya Berkey.

Alphabet | Characters | Orientation | Meaning of Space
Arabic | 28 | Right-to-left | End of word
Roman | 52 [4] | Left-to-right | End of word
Thai | 44 | Consonants: left-to-right. Vowels: above, below, or to the right of the consonant, depending on its phonetic characteristics | End of sentence

Figure 7. Translating among different alphabet types to resolve identity data issues can be challenging.

Script | Name | Address
Roman | Yuko Tanaka | 1-6-2 Akasaka, Minato-ku, Tokyo
Japanese [5] | 田中 裕子 | 東京都港区赤坂 1-6-2

Figure 8. Without identity resolution, customer records in different languages and alphabets cannot be matched.


Identity Resolution and the Market

By now, you probably see the wisdom of acquiring an identity resolution infrastructure from a qualified vendor. Identity resolution is a critical success factor for each and every CDI application. What do you need to know to make a good choice?

Business executives and IT stakeholders alike tend to underestimate the complexity of effective and efficient solutions, and they are unaware of the sophisticated data knowledge and statistical skills required to create one. In addition, the growing volume of Internet content is driving rapid development and improvements in identity resolution products.

Purchasing an identity resolution solution is the best practice for several reasons:

• Through product upgrades, your CDI applications benefit from improvements in identity resolution approaches and technology.

• Your IT organization avoids resource-consuming maintenance and support of identity resolution infrastructure.

• Most commercial identity resolution solutions enable business subject matter experts to manage business rules governing matching and error thresholds, whereas homegrown solutions tend to have hard-coded business rules. With hard-coded rules, IT must be engaged whenever the rules need changing, which creates a guaranteed bottleneck in business agility.

• Because commercial identity resolution solutions are designed to scale rapidly and on the fly to keep pace with Internet growth patterns, your architectures are much more likely to scale to support new CDI applications and rapid growth in existing CDI applications without frequent redesign or reconfiguration.

• Removing custom identity resolution development from the CDI initiative means that your first CDI application will be operational much sooner.
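The point above about hard-coded versus managed business rules can be sketched in a few lines. This is a simplified illustration under stated assumptions: the rule names (`match_threshold`, `review_threshold`), the JSON layout, and the `classify` function are hypothetical, and Python's general-purpose `difflib.SequenceMatcher` stands in for a real matching engine. The idea shown is only that thresholds kept outside the code can be changed without touching the matching logic.

```python
import json
from difflib import SequenceMatcher

# Hypothetical rules document. In practice a business user might maintain
# these values through a product's administration screen, not raw JSON.
RULES_JSON = '{"match_threshold": 0.85, "review_threshold": 0.60}'

def classify(name_a, name_b, rules):
    """Score two names and classify the pair using externally supplied rules."""
    # SequenceMatcher is a generic string-similarity stand-in, not an
    # identity resolution algorithm.
    score = SequenceMatcher(None, name_a.casefold(), name_b.casefold()).ratio()
    if score >= rules["match_threshold"]:
        return "match"
    if score >= rules["review_threshold"]:
        return "review"
    return "no match"

rules = json.loads(RULES_JSON)
result = classify("Jon Smith", "John Smith", rules)

# Tightening the rule changes the outcome without any change to classify().
stricter_rules = {"match_threshold": 0.99, "review_threshold": 0.60}
stricter_result = classify("Jon Smith", "John Smith", stricter_rules)

print(result, stricter_result)
```

In a homegrown solution with the thresholds written into `classify` itself, the second outcome would require a code change and a release, which is the bottleneck the list describes.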

The good news is that there are identity resolution vendors with robust products on the market today. Some provide identity resolution in conjunction with other data management functionality, while others focus only on a best-in-class identity resolution engine.

The best choice may hinge on the maturity of your data environment. Organizations that have already invested in tools that support data modeling, definition, profiling, mapping, and quality management (and data professionals who know how to use these tools effectively) are positioned to invest in a best-in-class identity resolution solution.

Organizations that are just starting to integrate data probably have few data professionals and tools in place. With many investment decisions to make, such organizations may be better off purchasing a multipurpose tool that includes identity resolution, even if that capability is less robust than a best-in-class solution.

As the data environment matures and the organization develops the skills to benefit from more robust tools, some of the functions can be moved off the multipurpose tool and onto a best-in-class solution. Identity resolution might be a candidate for such a transition.


Learn More

Informatica® Identity Resolution™ is robust, highly scalable identity resolution software that enables companies and government organizations to search and match identity data from more than 60 countries, in both batch and real time. Learn more about Informatica Identity Resolution and the entire Informatica product platform. Visit us at www.informatica.com or call 800.653.3871.

About Informatica

Informatica enables organizations to gain a competitive advantage in today’s global information economy by empowering them to access, integrate, and trust all their information assets. As the independent data integration leader, Informatica has a proven track record of success helping the world’s leading companies leverage all their information assets to grow revenues, improve profitability, and increase customer loyalty.

About the Author

Linda McHugh is a consultant with Baseline Consulting. She holds an M.Ed. from the University of Arizona and a Ph.D. from the University of Wisconsin–Madison.

Baseline Consulting is an acknowledged leader in the data integration and business analytics industry and helps large and midsized businesses enhance the value of enterprise data, improve business results, and achieve self-sufficiency in managing and using data as a corporate asset. Over half of Baseline’s clients are FORTUNE 1000 companies. Baseline offers business consulting and technical implementation services in four practice areas: Business Analytics, Data Warehousing, Data Management, and Data Integration. Founded in 1991 and headquartered in Los Angeles, California, Baseline’s only business is mastering data.

For more information, please call 818.906.7638, email [email protected], or visit www.baseline-consulting.com.

Worldwide Headquarters, 100 Cardinal Way, Redwood City, CA 94063, USA
phone: 650.385.5000 fax: 650.385.5500 toll-free in the US: 1.800.653.3871 www.informatica.com

Informatica Offices Around The Globe: Australia • Belgium • Canada • China • France • Germany • Japan • Korea • the Netherlands • Singapore • Switzerland • United Kingdom • USA

© 2008 Informatica Corporation. All rights reserved. Printed in the U.S.A. Informatica, the Informatica logo, and The Data Integration Company are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

6913 (10/29/2008)