14 - 1 Chapter 14 The Second Component: The Database.

  • date post

    21-Dec-2015

14 - 1

Chapter 14

The Second Component: The Database

14 - 2

The Importance of the Database

• The database is where an organization stores content for instantaneous retrieval when needed
  – Data, documents, pictures, or anything that can be represented in a computer is stored in the database, which is manipulated by a DBMS

• A business depends on information/knowledge to operate and the database is one of its most valuable resources

• Technology makes the database powerful: it can be searched at incredible speeds

• How?
  – Retrieve?
  – Correlate?
  – Go beyond data mining to semantics?

14 - 3

Storage of Data

• The first computers stored information on paper tape, punched cards, and magnetic tape

• The development of the magnetic disk changed processing dramatically

• RAM - flash memory - secondary storage

• Programs bring data from the disk (or secondary memory) into primary memory for processing
  – Disk access, however, is up to a million times slower than primary memory access
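
A rough back-of-the-envelope check of the "million times" figure. The two latencies below are assumed round numbers chosen for illustration (about 10 ms for a disk access, about 10 ns for a primary memory access), not measurements from the slides.

```python
# Assumed, illustrative latencies (not measured values).
DISK_ACCESS_SECONDS = 10e-3    # about 10 milliseconds per disk access
MEMORY_ACCESS_SECONDS = 10e-9  # about 10 nanoseconds per memory access


def slowdown_factor(disk_seconds: float, memory_seconds: float) -> float:
    """Return how many times slower a disk access is than a memory access."""
    return disk_seconds / memory_seconds


ratio = slowdown_factor(DISK_ACCESS_SECONDS, MEMORY_ACCESS_SECONDS)
print(f"Disk access is roughly {ratio:,.0f} times slower")
```

Under these assumptions, the gap between milliseconds and nanoseconds is what produces a factor on the order of a million.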

14 - 5

Database Management Software (DBMS)

• DBMSs automate tasks associated with using direct access files

• DBMS administrators
  – describe the database and its required indexes
  – define records

• Individual programs ask for specific pieces of data
  – only programs that access a piece of data need to be changed when that data changes
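
A minimal sketch of the last point, using Python's built-in sqlite3 module as a stand-in DBMS. The `student` table, its columns, and the `last_name_of` function are hypothetical names invented for this illustration. The program asks for one column by name, so it keeps working after an unrelated column is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, last_name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Lopez')")


def last_name_of(connection, student_id):
    """A 'program' that asks only for the piece of data it needs."""
    row = connection.execute(
        "SELECT last_name FROM student WHERE student_id = ?", (student_id,)
    ).fetchone()
    return row[0] if row else None


before = last_name_of(conn, 1)

# The database changes: a new column is added for some other program.
conn.execute("ALTER TABLE student ADD COLUMN phone TEXT")

after = last_name_of(conn, 1)  # unchanged program, same answer
print(before, after)
```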

14 - 6

DBMS Requirements

• A DBMS must provide the following
  – A method for defining the contents of the database
  – A way to describe relationships among data elements and records
  – A mechanism to set up the database in the first place
  – Ways to manipulate the data including
    • Updating (adding, modifying, and/or deleting information)
    • Retrieval
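
The requirements above can be sketched with Python's built-in sqlite3 module standing in for a DBMS. The tables, columns, and sample names are assumptions made up for this example.

```python
import sqlite3

# A mechanism to set up the database in the first place.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Define the contents of the database.
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, last_name TEXT NOT NULL)"
)
# Describe a relationship: each enrollment refers to one student.
conn.execute(
    """CREATE TABLE enrollment (
           student_id INTEGER NOT NULL REFERENCES student(student_id),
           class_name TEXT NOT NULL)"""
)

# Updating: adding, modifying, and deleting information.
conn.execute("INSERT INTO student VALUES (1, 'Lopez')")
conn.execute("INSERT INTO student VALUES (2, 'Chen')")
conn.execute("INSERT INTO enrollment VALUES (1, 'Databases')")
conn.execute("UPDATE student SET last_name = 'Lopez-Garcia' WHERE student_id = 1")
conn.execute("DELETE FROM student WHERE student_id = 2")

# Retrieval.
students = conn.execute("SELECT student_id, last_name FROM student").fetchall()
print(students)
```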

14 - 7

Benefits of the Relational Database Model

• Data are organized in two-dimensional tables, which are easy to develop and understand

• The structure can be described mathematically – each table represents a relation

• Columns from tables can be extracted and even joined

• Relational databases are easy to use
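
As an illustration of the mathematical view, a relation can be modeled as a set of tuples. The sketch below uses plain Python with made-up data; `project` and `join_on_first` are hypothetical helper names, and the join assumes the shared key is the first attribute of both relations.

```python
# A relation is a set of tuples; attribute names are kept separately.
STUDENT_ATTRS = ("student_id", "last_name", "city")
student = {
    (1, "Lopez", "Austin"),
    (2, "Chen", "Boston"),
}

ENROLL_ATTRS = ("student_id", "class_name")
enrollment = {
    (1, "Databases"),
    (2, "Networks"),
    (2, "Databases"),
}


def project(relation, attrs, wanted):
    """Extract the named columns from a relation (duplicates collapse)."""
    positions = [attrs.index(name) for name in wanted]
    return {tuple(row[i] for i in positions) for row in relation}


def join_on_first(left, right):
    """Join two relations whose first attribute is the shared key."""
    return {
        left_row + right_row[1:]
        for left_row in left
        for right_row in right
        if left_row[0] == right_row[0]
    }


cities = project(student, STUDENT_ATTRS, ("city",))
joined = join_on_first(student, enrollment)
print(sorted(cities))
print(sorted(joined))
```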

14 - 9

An Example

• Records consist of related data fields
  – Student number, student last name, student first name, address line 1, address line 2, city, state, zip code, phone number

• Each field consists of one data element and the size of the field is the same for each record

• Index or key fields make it easier to search records
  – Student ID number
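
A small sketch of fixed-size records with an index on the key field, using Python's struct module. The field widths (4-byte student number, 20-byte last name, 15-byte first name) and the sample students are assumptions for illustration. Only three of the listed fields are included to keep the example short.

```python
import struct

# Hypothetical layout: 4-byte student number, 20-byte last name, 15-byte first name.
RECORD = struct.Struct("<I20s15s")


def pack_student(number, last, first):
    """Encode one student as a fixed-size record (names are null-padded)."""
    return RECORD.pack(number, last.encode("ascii"), first.encode("ascii"))


def unpack_student(raw):
    """Decode a fixed-size record back into (number, last name, first name)."""
    number, last, first = RECORD.unpack(raw)
    return (
        number,
        last.rstrip(b"\0").decode("ascii"),
        first.rstrip(b"\0").decode("ascii"),
    )


students = [(1001, "Lopez", "Ana"), (1002, "Chen", "Wei"), (1003, "Okafor", "Chidi")]
data = b"".join(pack_student(*s) for s in students)

# Index on the key field: student number -> byte offset of the record.
index = {number: i * RECORD.size for i, (number, _, _) in enumerate(students)}


def find(number):
    """Jump straight to a record using the index instead of scanning."""
    offset = index[number]
    return unpack_student(data[offset:offset + RECORD.size])


print(RECORD.size, find(1002))
```

Because every record has the same size, the position of any record can be computed from its sequence number, and the index turns a search by student number into a single lookup.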

14 - 10

Microsoft Access RDBMS

• Can create relations and add data to them
  – E.g., Student and Class

• Inquire for information based on criteria

• Join relations on some key
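
The same three steps can be sketched outside Access. The example below uses Python's built-in sqlite3 module rather than Access itself, and the column names and rows are invented. For brevity the hypothetical Class relation also records which student is enrolled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Create the relations and add data to them.
conn.executescript(
    """
    CREATE TABLE Student (StudentID INTEGER PRIMARY KEY, LastName TEXT);
    CREATE TABLE Class (ClassName TEXT, StudentID INTEGER REFERENCES Student(StudentID));
    INSERT INTO Student VALUES (1, 'Lopez'), (2, 'Chen'), (3, 'Okafor');
    INSERT INTO Class VALUES ('Databases', 1), ('Databases', 2), ('Networks', 3);
    """
)

# Join the relations on the shared key and keep rows matching a criterion.
databases_roster = conn.execute(
    """SELECT Student.LastName
       FROM Student JOIN Class ON Student.StudentID = Class.StudentID
       WHERE Class.ClassName = 'Databases'
       ORDER BY Student.LastName"""
).fetchall()
print(databases_roster)
```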

14 - 17

Object-Oriented Databases

• Traditionally, relational databases supported a limited number of data types
  – Alphabetic, numeric, date, and time

• Modern organizations use a variety of data
  – Graphics objects, audio clips, videos, subscripted arrays, and complex data for data mining

• RDBMS vendors have extended their packages to handle such data objects
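
One common form of such an extension is a binary large object (BLOB) column. The sketch below uses Python's built-in sqlite3 module as an example of a relational system that accepts binary data; the `media` table and the stand-in bytes are assumptions, not taken from any vendor's product described in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE media (name TEXT PRIMARY KEY, kind TEXT, content BLOB)")

# Stand-in bytes for an audio clip; a real clip would be read from a file.
clip = bytes([0x52, 0x49, 0x46, 0x46, 0x00, 0x01, 0x02, 0x03])
conn.execute("INSERT INTO media VALUES (?, ?, ?)", ("greeting", "audio", clip))

stored = conn.execute(
    "SELECT content FROM media WHERE name = ?", ("greeting",)
).fetchone()[0]
print(len(stored), stored == clip)
```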

14 - 18

Structured Query Language (SQL)

• A retrieval language for users

• Basic structure of a SQL expression
  – The select clause lists the attributes desired in the answer to a query
  – The from clause lists the relations or tables that the query language processor should consult in filling the request
  – The where clause describes the conditions that rows must satisfy to be included in the answer

• SQL is used as an intermediary and a standard in accessing several different database systems
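
A minimal query showing the three clauses, run through Python's built-in sqlite3 module. The `student` table and its rows are made up for this example; the ORDER BY clause is added only to make the result order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER, last_name TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Lopez", "TX"), (2, "Chen", "MA"), (3, "Okafor", "TX")],
)

query = """
    SELECT student_id, last_name   -- attributes desired in the answer
    FROM student                   -- relation to consult
    WHERE state = 'TX'             -- condition each returned row must satisfy
    ORDER BY student_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```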

14 - 19

Oracle: An Enterprise DBMS

• Oracle DBMS architectures are server-centric

• Extended relational data model that supports many different data types and uses SQL for queries

• Typically supports thousands of users, processes terabytes of data, and integrates with Oracle application packages in financial management, supply chain management, manufacturing, etc.

14 - 20

Distributed Databases

• Different parts of the database are located on different computers in a network

• Distributed databases raise several issues
  – Will data be replicated across computers, or will there be only one copy?
  – If data are replicated, how frequently must different versions be updated to reflect changes?
  – How will updates to the database be coordinated so that integrity is maintained?
  – Who “owns” distributed data and who has access to it?

• Distributed databases offer users easier access to data at the cost of higher overall system complexity

14 - 21

The Data Warehouse

• Businesses collect a tremendous amount of transaction data from routine operations

• These data can be analyzed to understand the business better
  – Requires multidimensional analysis called Online Analytical Processing (OLAP)
  – Helps create a learning organization that is better able to understand its markets, customers, and itself
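
A toy sketch of multidimensional analysis: the same sales figures totaled along different combinations of dimensions. The records, the dimension names, and the `roll_up` helper are all hypothetical; a real OLAP tool works on far larger data and offers many more operations.

```python
from collections import defaultdict

# Hypothetical transaction records: (region, quarter, product, sales amount).
transactions = [
    ("East", "Q1", "Laptop", 1200),
    ("East", "Q1", "Phone", 800),
    ("East", "Q2", "Laptop", 1500),
    ("West", "Q1", "Phone", 700),
    ("West", "Q2", "Phone", 900),
]
DIMENSIONS = ("region", "quarter", "product")


def roll_up(records, *dims):
    """Total sales grouped by the chosen dimensions."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[i] for i in positions)
        totals[key] += record[-1]
    return dict(totals)


by_region = roll_up(transactions, "region")
by_region_quarter = roll_up(transactions, "region", "quarter")
print(by_region)
print(by_region_quarter)
```

Choosing a different set of dimensions for `roll_up` gives a different view of the same underlying transactions, which is the core idea behind slicing data by region, time, or product.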

14 - 25

Data Mining

• Discovers interesting structure in large amounts of data

• This structure consists of
  – Patterns
  – Statistical or predictive models of the data
  – Relationships between the data

• Applied extensively to customer data
  – Allows firms to determine, for instance, which products sell together
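
One simple way to find products that sell together is to count how often each pair appears in the same purchase. The sketch below does this over a handful of invented baskets; it is a plain co-occurrence count, not a full association-rule algorithm, and `pair_counts` is a hypothetical helper name.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchases, one set of products per customer transaction.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]


def pair_counts(transactions):
    """Count how many baskets contain each unordered pair of products."""
    counts = Counter()
    for basket in transactions:
        # Sorting makes ("bread", "butter") and ("butter", "bread") the same pair.
        counts.update(combinations(sorted(basket), 2))
    return counts


counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```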

14 - 26

Reasons for Data Mining

• Increasing business unit and overall profitability

• Understanding customer desires and needs

• Identifying profitable customers and acquiring new ones

• Retaining customers and increasing loyalty

• Increasing ROI and reducing costs on promotions

• Cross-selling and up-selling

• Detecting fraud, waste, and abuse

• Determining credit risks

• Increasing web site profitability

• Increasing store traffic and optimizing store layouts

• Monitoring business performance

14 - 27

Approaches to Data Mining

• Visualization: a graph can tell a lot

• Statistical techniques

• Search and optimization

• Artificial intelligence (e.g., neural networks)

14 - 28

Databases and the Organization

• The typical organization has many databases
  – Some are organized while others are loose collections of information

• A manager is responsible for the creation, maintenance, and protection of data

• Databases are the firm’s memory and allow it to remain in business

• Databases provide the firm with incredible opportunities

14 - 29

Summary

• Organizations keep tremendous amounts of machine-readable data

• Data in files are stored in records, which consist of fields, which contain groups of characters

• The DBMS automates the task of setting up a database

• The relational model is the dominant DBMS model today

• Data warehouses and data mining can contribute significantly to firm success