Chapter 03 it-8ed-volonino


Page 1: Chapter 03 it-8ed-volonino

Data, Text, and Document Management

Chapter 3

Copyright 2012 John Wiley & Sons, Inc.

Management Information Systems EIMBA

Part II. Data and Network Infrastructure

Page 2: Chapter 03 it-8ed-volonino


Chapter 3 Outline

3.1 Data, Text, and Document Management

3.2 File Management Systems

3.3 Database Management Systems

3.4 Data Warehouses, Data Marts, and Data Centers

3.5 Enterprise Content Management


Page 3: Chapter 03 it-8ed-volonino

Chapter 3 Learning Objectives

Describe data, text, and document management, and their impacts on performance.

Understand file management systems.

Understand the functions of databases and database management systems.

Describe the tactical and strategic benefits of data warehouses, data marts, and data centers.


Page 4: Chapter 03 it-8ed-volonino


For Class Discussion & Debate

Wendy's International Relies on Text Mining for Customer Experience Management

Scenario for Brainstorming & Discussion (see book for full text)

1. Select an industry, company, or public sector.

2. Identify costs due to ignorance about customers’ or constituents’ experiences.

3. Explain how your selection could benefit from text analytics that provided feedback within 24 hours.

4. Compare and assess your answers with others in class.


Page 5: Chapter 03 it-8ed-volonino


Debate (see book for full text)

• Select one side of the argument, as described in the textbook.

• Debate whether investments in text message collection and mining should be made even if no ROI can be determined in advance.

• Provide convincing arguments either in favor of or against the investment in text message collection and mining.


Page 6: Chapter 03 it-8ed-volonino


3.1 Data, Text, and Document Management

Data, text, and documents are strategic assets. Vast quantities are:

• created and collected

• then stored – often in 5 or more locations

Data, text, and document management helps companies improve productivity by ensuring that people can find what they need without having to conduct a long and difficult search.


Page 7: Chapter 03 it-8ed-volonino


Data Management

Why does data management matter?

• No enterprise can be effective without high-quality data that is accessible when needed.

• Data that’s incomplete or out of context cannot be trusted.

• Organizations with at least 1,000 knowledge workers lose ~ $5.7 million annually in time wasted by employees reformatting data as they move among applications.

What is the goal of data management?

• To provide the infrastructure and tools to transform raw data into usable information of the highest quality.


Page 8: Chapter 03 it-8ed-volonino


Data Management

Why is data management difficult and expensive?

• Volume of data is increasing exponentially.

• Data is scattered throughout the organization.

• Data is created and used offline without going through quality control checks.

• Data may be redundant and out-of-date, creating a huge maintenance problem.


Page 9: Chapter 03 it-8ed-volonino


Current key issues

Master data management (MDM): Processes to integrate data from various sources and enterprise apps in order to create a unified view of the data.

Document management system (DMS): Hardware and software to manage, archive, and purge files and other electronic documents (e-documents).

Green computing: Efforts to conserve natural resources and reduce effects of computer usage on the environment.
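As a rough illustration of the master data management idea above, the following Python sketch merges customer records from two source systems into one unified view. The system names, the field names, and the rule of matching on a normalized e-mail address are all assumptions made for this example; they do not come from the textbook.

```python
# Hypothetical records from two enterprise apps (names and fields are assumed).
crm_records = [
    {"email": "Ana@Example.com", "name": "Ana Ruiz", "phone": None},
    {"email": "bo@example.com", "name": "Bo Chen", "phone": "555-0101"},
]
billing_records = [
    {"email": "ana@example.com ", "name": "Ana Ruiz", "phone": "555-0199"},
]


def normalize_key(email):
    """Use a trimmed, lowercased e-mail address as the matching key."""
    return email.strip().lower()


def build_master(*sources):
    """Merge records that share a key; the first non-empty value per field wins."""
    master = {}
    for source in sources:
        for record in source:
            key = normalize_key(record["email"])
            merged = master.setdefault(key, {"email": key})
            for field, value in record.items():
                if field == "email":
                    continue
                if value and not merged.get(field):
                    merged[field] = value
    return master


master = build_master(crm_records, billing_records)
for key, record in sorted(master.items()):
    print(key, record)
```

Real MDM tools use far richer matching and survivorship rules; the "first non-empty value wins" rule here is only the simplest possible choice.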


Data Management

Page 10: Chapter 03 it-8ed-volonino


IT at Work 3.1 – Healthcare Sector

Data Errors Cost Billions of Dollars and Put Lives at Risk

Every day, healthcare administrators and others throughout the healthcare supply chain waste 24% to 30% of their time correcting data errors.

Each incorrect transaction costs $60 to $80 to correct.

About 60% of all invoices among supply chain partners have errors, and each invoice error costs $40 to $400 to reconcile.

Each year, billions of dollars are wasted in the healthcare supply chain because of supply chain data disconnects.
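To get a feel for how the invoice figures above add up, here is a small worked calculation in Python. The 60% error rate and the $40 to $400 reconciliation cost are the figures quoted on the slide; the volume of 10,000 invoices is a hypothetical number chosen only for illustration.

```python
# Figures quoted on the slide.
ERROR_RATE_PERCENT = 60          # share of invoices with errors
COST_PER_ERROR = (40, 400)       # dollars to reconcile one invoice error (low, high)


def reconciliation_cost(invoice_count):
    """Return the (low, high) dollar cost of reconciling invoice errors."""
    errors = invoice_count * ERROR_RATE_PERCENT / 100
    low, high = COST_PER_ERROR
    return errors * low, errors * high


# 10,000 invoices is an assumed volume, not a figure from the source.
low, high = reconciliation_cost(10_000)
print(f"${low:,.0f} to ${high:,.0f}")
```

Under these assumptions, 6,000 invoices would contain errors, so the reconciliation cost would fall between $240,000 and $2,400,000.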


Page 11: Chapter 03 it-8ed-volonino


IT at Work 3.1 (continued)

Data Errors Cost Billions of Dollars and Put Lives at Risk

Benefits from data synchronization in the healthcare sector and supply chain:

• Easier and faster product sourcing because of accurate and consistent item information

• Significantly less fraud and unauthorized purchasing

• Fewer unnecessary inventories

• Lower prices because purchase volumes become apparent

• Improved patient safety


Page 12: Chapter 03 it-8ed-volonino


Figure 3.2 Data life cycle

Data management is a structured approach for capturing, storing, processing, integrating, distributing, securing, and archiving data effectively throughout their life cycle.

Page 13: Chapter 03 it-8ed-volonino


Data problems


Page 14: Chapter 03 it-8ed-volonino


Data principles

Principle of diminishing data value. The more recent the information, the more valuable it is.

Principle of 90/90 data use: 90% of data is seldom accessed after 90 days.

Principle of data in context. Investment in data management (DM) infrastructure may be huge.
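One practical reading of the 90/90 principle is that data untouched for 90 days is a candidate for cheaper, slower storage. The sketch below, a minimal example under that assumption, splits records by last access date. The file names, dates, and the `split_by_last_access` helper are hypothetical.

```python
from datetime import date, timedelta


def split_by_last_access(records, today, max_age_days=90):
    """Split records into (active, archive_candidates) by last access date."""
    cutoff = today - timedelta(days=max_age_days)
    active = [r for r in records if r["last_accessed"] >= cutoff]
    stale = [r for r in records if r["last_accessed"] < cutoff]
    return active, stale


# Hypothetical file names and dates, for illustration only.
records = [
    {"name": "q1_sales.csv", "last_accessed": date(2012, 1, 15)},
    {"name": "open_orders.csv", "last_accessed": date(2012, 5, 28)},
]
active, stale = split_by_last_access(records, today=date(2012, 6, 1))
print("active:", [r["name"] for r in active])
print("archive candidates:", [r["name"] for r in stale])
```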


Page 15: Chapter 03 it-8ed-volonino


IT at Work 3.4

Check page 69


Page 16: Chapter 03 it-8ed-volonino


Transforming data into knowledge

Text mining and analytics:

• Exploration

• Preprocessing

• Categorizing and modeling
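The three text mining steps can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how a commercial text analytics product works: the customer comments, the stop-word list, and the keyword-based categories are all invented for the example, and real tools use statistical models rather than fixed keyword sets.

```python
import re
from collections import Counter

# Hypothetical customer comments; stop words and category keywords are assumed.
comments = [
    "The fries were cold and the line was slow.",
    "Friendly staff, but the burger was cold.",
    "Great service and a very clean restaurant!",
]
STOP_WORDS = {"the", "and", "was", "were", "a", "but", "very"}
CATEGORIES = {
    "food": {"fries", "burger", "cold"},
    "service": {"staff", "service", "line", "slow", "friendly"},
}


def preprocess(text):
    """Preprocessing: lowercase, split into words, drop stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def categorize(tokens):
    """Categorizing: assign every category whose keywords appear in the tokens."""
    return sorted(name for name, keywords in CATEGORIES.items()
                  if keywords & set(tokens))


# Exploration: which terms occur most often across all comments?
term_counts = Counter(t for c in comments for t in preprocess(c))
print(term_counts.most_common(3))

for comment in comments:
    print(categorize(preprocess(comment)), "-", comment)
```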


Page 17: Chapter 03 it-8ed-volonino


Figure 3.4. Model of an Enterprise Data Warehouse

Data from various sources are extracted, transformed, & loaded (ETL) into a data warehouse; then used to support functions and apps throughout the enterprise.
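A minimal sketch of the extract, transform, and load (ETL) flow described above, using Python's built-in `sqlite3` module as a stand-in for the warehouse. The two source systems, their formats, and the `sales` table are assumptions made for the example.

```python
import sqlite3

# Extract: hypothetical rows from two operational systems with different formats.
pos_rows = [("2012-03-01", "store-7", "19.99"), ("2012-03-01", "store-7", "5.01")]
web_rows = [{"day": "03/02/2012", "channel": "web", "total_cents": 2500}]


def transform(pos, web):
    """Transform: convert both sources to (ISO date, source, amount in dollars)."""
    for day, store, amount in pos:
        yield day, store, float(amount)
    for row in web:
        month, dom, year = row["day"].split("/")
        yield f"{year}-{month}-{dom}", row["channel"], row["total_cents"] / 100


# Load: insert the cleaned rows into a warehouse table (in-memory for the sketch).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (day TEXT, source TEXT, amount REAL)")
warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                      transform(pos_rows, web_rows))
warehouse.commit()

rows = warehouse.execute(
    "SELECT day, source, amount FROM sales ORDER BY day, amount").fetchall()
print(rows)
```

Once loaded, every row shares one date format and one currency unit, which is what lets downstream applications query the warehouse without knowing how each source system stored its data.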

Page 18: Chapter 03 it-8ed-volonino


3.2 File Management Systems

Computer systems organize data into a hierarchy: bits, bytes, fields, records, files, and databases


Figure 3.6 Hierarchy of data for a computer-based file.
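The hierarchy of bits, bytes, fields, records, files, and databases can be mirrored with ordinary Python data structures. This is only an analogy to make the levels concrete; the customer fields and values are hypothetical.

```python
# A field is a group of characters (bytes); each byte is 8 bits.
field_value = "Smith"
field_bytes = field_value.encode("ascii")
field_bits = [format(b, "08b") for b in field_bytes]  # each byte shown as 8 bits

# A record is a group of related fields (hypothetical customer fields).
record = {"customer_id": 101, "last_name": field_value, "city": "Austin"}

# A file is a collection of related records.
customer_file = [record, {"customer_id": 102, "last_name": "Lee", "city": "Boston"}]

# A database is a collection of related files.
database = {"customers": customer_file, "orders": []}

print(len(field_bytes), "bytes; first byte as bits:", field_bits[0])
print(len(customer_file), "records in file;", len(database), "files in database")
```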

Page 19: Chapter 03 it-8ed-volonino


Limitations of the File Environment

When organizations began using computers, they started with one application at a time, usually accounting, billing, and payroll. Each app was designed to be a stand-alone system, which led to data problems.

Data problems with a file environment:

• data redundancy

• data inconsistency

• data isolation

• data security
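The sketch below illustrates how redundancy leads to inconsistency when each stand-alone application keeps its own file. The two application files and the customer record are hypothetical, and Python dictionaries stand in for the files.

```python
# Each stand-alone application keeps its own copy of customer data (redundancy).
accounting_file = {101: {"name": "Ana Ruiz", "address": "12 Oak St"}}
billing_file = {101: {"name": "Ana Ruiz", "address": "12 Oak St"}}

# The customer moves, but only the billing application is updated (isolation).
billing_file[101]["address"] = "98 Elm Ave"


def find_inconsistencies(file_a, file_b):
    """Return the IDs whose records differ between two application files."""
    shared_ids = file_a.keys() & file_b.keys()
    return sorted(i for i in shared_ids if file_a[i] != file_b[i])


print(find_inconsistencies(accounting_file, billing_file))
```

The two copies now disagree about customer 101, and neither application can tell which address is correct.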


Page 20: Chapter 03 it-8ed-volonino

• Stand-alone systems result in data redundancy, inconsistency, and isolation.

• Database management systems helped solve the data problems of file-based systems.


Page 21: Chapter 03 it-8ed-volonino

Figure 3.10 Database management system provides access to all data in the database.


Page 22: Chapter 03 it-8ed-volonino


3.3 Database Management Systems (DBMS)

Numerous data sources:

• clickstream data from Web and e-commerce applications

• detailed data from POS terminals

• filtered data from CRM, supply chain, and enterprise resource planning applications

DBMS permits an organization to centralize data, manage them efficiently, and give application programs access to the stored data.
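To show what centralizing data and giving application programs access to it looks like in practice, here is a minimal sketch using Python's built-in `sqlite3` module as the DBMS. The `customers` table and the two "applications" (plain functions here) are hypothetical.

```python
import sqlite3

# One central database instead of one file per application (in-memory sketch).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, address TEXT)")
db.execute("INSERT INTO customers VALUES (101, 'Ana Ruiz', '12 Oak St')")


def billing_address(conn, customer_id):
    """Hypothetical billing application: reads the shared customer row."""
    row = conn.execute(
        "SELECT address FROM customers WHERE id = ?", (customer_id,)).fetchone()
    return row[0]


def shipping_label(conn, customer_id):
    """Hypothetical shipping application: reads the same row."""
    name, address = conn.execute(
        "SELECT name, address FROM customers WHERE id = ?", (customer_id,)).fetchone()
    return f"{name}, {address}"


# A single update through the DBMS is visible to every application.
db.execute("UPDATE customers SET address = ? WHERE id = ?", ("98 Elm Ave", 101))
print(billing_address(db, 101))
print(shipping_label(db, 101))
```

Because both functions query the same stored row, one update keeps them in agreement, which is the point of moving from separate files to a shared database.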


Page 23: Chapter 03 it-8ed-volonino

2 types of databases:

a) Centralized database

b) Distributed database with complete or partial copies of the central database in more than one location

Page 24: Chapter 03 it-8ed-volonino


Functions of a Database Management System (DBMS)

Data filtering and profiling: Inspecting the data for errors, inconsistencies, redundancies, and incomplete information.

Data quality: Correcting, standardizing, and verifying the integrity of the data.

Data synchronization: Integrating, matching, or linking data from disparate sources.

Data enrichment: Enhancing data using information from internal and external data sources.

Data maintenance: Checking and controlling data integrity over time.
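Two of the functions listed above, profiling and quality, can be sketched with plain Python. The supplier records, the checks performed, and the standardization rule are assumptions chosen for illustration; production tools apply many more rules.

```python
from collections import Counter

# Hypothetical supplier records with typical defects.
records = [
    {"id": 1, "state": "tx", "zip": "73301"},
    {"id": 2, "state": "TX ", "zip": ""},
    {"id": 3, "state": "CA", "zip": "9021"},
    {"id": 3, "state": "ca", "zip": "90210"},
]


def is_valid_zip(value):
    """Assumed rule for the sketch: a ZIP code is exactly five digits."""
    return value.isdigit() and len(value) == 5


def profile(rows):
    """Profiling: report duplicate IDs, missing ZIP codes, and malformed ZIP codes."""
    id_counts = Counter(r["id"] for r in rows)
    return {
        "duplicate_ids": sorted(i for i, n in id_counts.items() if n > 1),
        "missing_zip": [r["id"] for r in rows if not r["zip"]],
        "bad_zip": [r["id"] for r in rows if r["zip"] and not is_valid_zip(r["zip"])],
    }


def standardize(row):
    """Quality: return a copy with the state code trimmed and uppercased."""
    return {**row, "state": row["state"].strip().upper()}


report = profile(records)
cleaned = [standardize(r) for r in records]
print(report)
print([r["state"] for r in cleaned])
```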


Page 25: Chapter 03 it-8ed-volonino


3.4 Data Warehouses, Data Marts, and Data Centers

Data warehouse: a repository in which data are organized so that they can be readily analyzed using methods such as data mining, decision support, querying, and other applications. Data warehouses:

• enable managers and knowledge workers to leverage enterprise data to make the smartest decisions

• enable OLAP (online analytical processing)

Data marts: designed for a strategic business unit (SBU) or a single department.

Data centers: facilities containing mission-critical ISs and components that deliver data and IT services to the enterprise.
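The kind of analysis a data warehouse supports can be hinted at with two aggregate queries. This sketch uses Python's built-in `sqlite3` module and a tiny, invented `sales` table; a real OLAP tool offers much more (multidimensional cubes, drill-down, pivoting), so treat this only as a flavor of rolling data up and slicing it.

```python
import sqlite3

# Hypothetical warehouse table and figures, for illustration only.
dw = sqlite3.connect(":memory:")
dw.execute("CREATE TABLE sales (region TEXT, product TEXT, year INTEGER, revenue REAL)")
dw.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", [
    ("East", "Widget", 2011, 100.0),
    ("East", "Widget", 2012, 150.0),
    ("West", "Widget", 2012, 80.0),
    ("West", "Gadget", 2012, 120.0),
])

# Roll up: total revenue by region across all years and products.
by_region = dw.execute(
    "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Slice: restrict to one year, then break revenue down by product.
by_product_2012 = dw.execute(
    "SELECT product, SUM(revenue) FROM sales WHERE year = 2012 "
    "GROUP BY product ORDER BY product"
).fetchall()

print(by_region)
print(by_product_2012)
```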


Page 26: Chapter 03 it-8ed-volonino


Figure 3.11 Data warehouse framework and views.

Page 27: Chapter 03 it-8ed-volonino


Characteristics of a data warehouse

Organization. Data are organized by subject (e.g., by customer, vendor, product, price level, and region), and contain information relevant for decision support only.

Consistency. Data in different operational databases may be encoded differently. For example, gender data may be encoded 0 and 1 in one operational system and “m” and “f” in another. In the warehouse they will be coded in a consistent manner.

Time variant. The data are kept for many years so they can be used for trends, forecasting, and comparisons over time.

Nonvolatile. Once entered into the warehouse, data are not updated.

Relational. Typically the data warehouse uses a relational structure.

Client/server. The data warehouse uses the client/server architecture mainly to provide the end user easy access to its data.

Web-based. Today’s data warehouses are designed to provide an efficient computing environment for Web-based applications (Rundensteiner et al., 2000).
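The consistency characteristic, with its gender-code example, can be sketched as a simple translation step applied before data enter the warehouse. The source only says one system uses 0 and 1 and another uses "m" and "f"; which number stands for which value, the system names, and the target codes "M" and "F" are assumptions made here.

```python
# Hypothetical per-system code tables. The 0/1 assignment and the target
# codes "M"/"F" are assumptions; the source does not specify them.
GENDER_MAPS = {
    "system_a": {0: "M", 1: "F"},
    "system_b": {"m": "M", "f": "F"},
}


def to_warehouse_code(system, value):
    """Translate a source-system gender code to the single warehouse code."""
    try:
        return GENDER_MAPS[system][value]
    except KeyError:
        raise ValueError(f"unknown gender code {value!r} from {system}") from None


rows = [("system_a", 0), ("system_b", "f"), ("system_a", 1)]
codes = [to_warehouse_code(system, value) for system, value in rows]
print(codes)
```

Rejecting unknown codes with an error, rather than passing them through, keeps inconsistent values from entering the warehouse unnoticed.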


Page 28: Chapter 03 it-8ed-volonino


Page 29: Chapter 03 it-8ed-volonino


Building an Enterprise Data Warehouse (EDW)

A company that is considering building a DW first needs to address a series of basic questions to avoid a failure:

• Does top management support the DW?

• Do users want access to a broad range of data?

• Do users want data access and analysis tools?

• Do users understand how to use the DW to solve business problems?

• Does the unit have one or more power users who can understand DW technologies?


Page 30: Chapter 03 it-8ed-volonino


Figure 3.12 Teradata Corp.’s EDW

Page 31: Chapter 03 it-8ed-volonino


Suitability

Data warehousing is most appropriate for organizations that have some of the following characteristics:

End users need to access large amounts of data

Operational data are stored in different systems

The organization serves a large, diverse customer base

The same data are represented differently in different systems

Extensive end-user computing is performed


Page 32: Chapter 03 it-8ed-volonino


3.5 Enterprise Content Management

ECM includes:

• electronic document management

• Web content management

• digital asset management

• electronic records management (ERM)


Page 33: Chapter 03 it-8ed-volonino


Figure 3.13 Electronic records management from creation to retention or destruction

Page 34: Chapter 03 it-8ed-volonino


Unstructured business records

Businesses generate volumes of documents, messages, and memos that, by their nature, contain unstructured content that cannot be put into a database.

Many of these materials are business records that must be retained and made available when requested by auditors, investigators, the SEC, the IRS, or other authorities.

To be retrievable, business records must be organized and indexed.

Records that are not needed for current operations or decisions are archived, that is, moved into longer-term storage.


Page 35: Chapter 03 it-8ed-volonino


Business Value of E-Records Management

Companies need to be prepared to respond to an audit, federal investigation, lawsuit, or other legal action against them.

• Examples of lawsuits: patent violations, fraud, product safety negligence, theft of intellectual property, breach of contract, wrongful termination, harassment, and discrimination

E-discovery is the process of gathering electronically stored information in preparation for trial, legal or regulatory investigation, or administrative action as required by law.

• When a company receives an e-discovery request, the company must produce what is requested—or face charges of obstructing justice or being in contempt of court.


Page 36: Chapter 03 it-8ed-volonino


Companies have incurred huge costs for not responding to e-discovery

Failure to save e-mails resulted in a $2.75 million fine for Philip Morris.

Failure to respond to e-discovery requests cost Bank of America $10 million in fines.

Failure to produce backup tapes and deleted e-mails resulted in a $29.3 million jury verdict against UBS Warburg in the landmark case, Zubulake v. UBS Warburg.


Page 37: Chapter 03 it-8ed-volonino


Page 38: Chapter 03 it-8ed-volonino


Page 39: Chapter 03 it-8ed-volonino


Exercise

Visit Analysis Factory at analysisfactory.com. Click to view the interactive business solution dashboards.

Select one type of dashboard and explain its value or features.


Page 40: Chapter 03 it-8ed-volonino


Chapter 3 Link Library

Advizor Solutions, data analytics and visualization http://advizorsolutions.com/

Clarabridge: How Text Mining Works http://clarabridge.com/

SAS Text Miner http://sas.com/

Tableau data visualization software http://tableausoftware.com/data-visualization-software/

EMC Corp., enterprise content management http://emc.com

Oracle DBMS http://oracle.com/
