What is a database? An organized collection of data. This can be in an electronic, paper, or other...



What is a database? An organized collection of data. This can be in an electronic, paper, or other format.

Types of databases

Operational - constantly changing because entries are dynamic. An example is a customer purchases and inventory control database.

Analytical - once data are collected, they remain static. This is typical of scientific databases.

Legacy - Also known as an inherited database. Created by someone else.

Created - My own term for a database you create.

Derived - A database you create by importing another database.


Flat file databases

This is commonly the way we first view “databases”. Spreadsheets, word processing documents, or simple ASCII files are common examples.


Order ID  Order Date    Ship Date    Sales Rep  Customer                        Item 1             Quantity  Item 2             Quantity  ….
1         10 May, 2003  11 May 2003  Jim        MSU                             Plankton Splitter  1         Ekman Dredge       1         ….
2         May 11, 2003  11 May 2003  Jim        Michigan State                  Ekman Dredge       2         Plankton Splitter  2         ….
3         5/12/2003     11 May 2003  Bill, Jim  M.S.U.                          Plankton net       3                                      ….
4         5/12/03       11 May 2003  Jim, Bill  That other school in Ann Arbor  Zooplankton net    1                                      ….


This example shows a lot of problems. For example:
-Very constrained - only two items allowed per order
-Lacks ability to search easily (e.g., finding a specific item ordered is difficult and not always robust)
-Lacks database integrity. For example, MSU is not represented consistently

The first and most critical concept is that of a relational database, where the data are stored in multiple tables when necessary.

Associated with this is the key idea that the data may be stored in a different format than how we view the data.

We will get back to these ideas again (probably more often than you would like!).
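As a rough illustration of these two ideas, here is a minimal sketch that stores the flat-file orders in several related tables using Python's built-in sqlite3 module. The table and column names, the ISO date strings, and the choice to treat "MSU", "Michigan State", and "M.S.U." as one customer named "Michigan State University" are assumptions made for this example, not part of the original slides. The slash dates are read as month/day/year, which is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE      -- one spelling per customer
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    order_date  TEXT NOT NULL,            -- one consistent date format (ISO text)
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
);
CREATE TABLE order_items (                -- any number of items per order
    order_id INTEGER NOT NULL REFERENCES orders(order_id),
    item     TEXT NOT NULL,
    quantity INTEGER NOT NULL,
    PRIMARY KEY (order_id, item)
);
-- A view: the data are stored in three tables but can be viewed as one.
CREATE VIEW order_summary AS
    SELECT o.order_id, c.name, i.item, i.quantity
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    JOIN order_items AS i ON i.order_id = o.order_id;
""")

conn.executemany("INSERT INTO customers VALUES (?, ?)", [
    (1, "Michigan State University"),     # assumed canonical name
    (2, "That other school in Ann Arbor"),
])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, "2003-05-10", 1), (2, "2003-05-11", 1),
    (3, "2003-05-12", 1), (4, "2003-05-12", 2),
])
conn.executemany("INSERT INTO order_items VALUES (?, ?, ?)", [
    (1, "Plankton Splitter", 1), (1, "Ekman Dredge", 1),
    (2, "Ekman Dredge", 2), (2, "Plankton Splitter", 2),
    (3, "Plankton net", 3), (4, "Zooplankton net", 1),
])

def orders_containing(item):
    """Return the IDs of all orders that include the given item."""
    rows = conn.execute(
        "SELECT order_id FROM order_items WHERE item = ? ORDER BY order_id",
        (item,),
    )
    return [row[0] for row in rows]

ekman_orders = orders_containing("Ekman Dredge")
order_3_view = conn.execute(
    "SELECT * FROM order_summary WHERE order_id = 3"
).fetchall()
print(ekman_orders)
print(order_3_view)
```

With this layout, searching for an item no longer depends on whether it was typed into the "Item 1" or "Item 2" column, and an order is not limited to two items.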


Overview of Database Design Process

1. Goals and objectives for database
2. Analyze current database
3. Create data structure
4. Establish table relationships
5. Define business rules
6. Establish views
7. Review data integrity

One of the key points is that this is an iterative process – you may need to go back to earlier steps if you find problems


Example of Database Design Process

-Introduction to example data set

1. Goals and objectives

Goal is to be able to determine the catch and size distribution of individual fish species at specific sites or groups of sites in our research program. We also want to be able to describe habitat conditions at these sites and relate them to the fish catches

Objectives:
1. To be able to compute catch per effort for each species at individual sites, and for the above barrier sites and for the below barrier sites as a group
2. To be able to compute mean size for each species at individual sites, and for the above barrier sites and below barrier sites
3. ...


2. Analyze current database

In this case, we have data sheets already filled in, so we will use these to analyze our current (paper) database.

Begin by describing how data are collected. During this process, focus on units of observation (entities) or sampling events, and descriptions or measurements.

Create list of all variables (attributes), entities and events

Associate every variable with one or more entity or event


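[Figure: stream sampling layout. Labels: water flow; barrier; sites numbered 1, 2, and 3, with each number appearing twice. Within a site: Transect 1 (width, depth, 50 substrate particles), Transect 2, Transect 3.]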


Variables
-Stream name
-Fish species caught
-Fish length
-Sample date
-Position (Above or Below Barrier)
-Treatment or Reference Stream
-Segment ID number (=site)
-Length of segment
-Crew members
-Conductivity
-Water Temperature
-Weather Conditions
-Water Conditions
-Transect width
-Transect depth
-Transect ID number
-Particle size

Entities or Events
-Shocking
-Habitat


Refinements

Variables
-Stream name
-Fish species caught (Common name, scientific name, family)
-Fish length
-Sample date (Year, Month, Day)
-Position (Above or Below Barrier)
-Treatment or Reference Stream
-Segment ID number (=site)
-Length of segment
-Crew members (always three)
-Conductivity
-Water Temperature
-Weather Conditions (Cloud Cover, Precipitation)
-Water Conditions (Water color, Water height)
-Transect width
-Transect depth
-Transect ID number
-Particle size

Entities or Events
-Shocking
-Habitat
-Streams
-Transects
-Substrate


From this preliminary set of entities and descriptors, develop preliminary list of tables and fields

TABLES - contain information on particular entities or events
FIELDS - describe the attributes of entities or events
RECORD - contains the information or data on an individual entity or event


Characteristics of a “Good” Field

• It represents a characteristic of the subject of the table

• It contains only a single value (e.g., if a course had two instructors, the instructor field should not contain both names). This is in contrast to MULTIVALUED FIELDS.

• It cannot be broken down into smaller components. For example, a field holding a person's entire address is not a good field, because it can be broken down into street address, city, state, and zip code. This is in contrast to MULTIPART FIELDS.

• It does not contain a calculated value. Fields which are determined by values in other fields are CALCULATED FIELDS.

• The field is unique within the database unless it is needed to link tables

• The field retains all its characteristics if it appears in more than one table
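The three kinds of problem fields can be shown with a small Python sketch. The helper name `split_multivalued`, the sample date, and the fish lengths are hypothetical values chosen for illustration. Only the "Bill, Jim" sales rep entry comes from the flat-file example.

```python
from datetime import date

def split_multivalued(value: str) -> list[str]:
    """Split a comma-separated multivalued field into single values."""
    return [part.strip() for part in value.split(",") if part.strip()]

# Multivalued field: one flat-file row lists two sales reps in one field.
flat_order = {"order_id": 3, "sales_rep": "Bill, Jim"}
rep_rows = [
    (flat_order["order_id"], rep)
    for rep in split_multivalued(flat_order["sales_rep"])
]

# Multipart field: a sample date broken into year, month, and day.
sample_date = date(2003, 5, 12)  # hypothetical sample date
date_fields = {
    "year": sample_date.year,
    "month": sample_date.month,
    "day": sample_date.day,
}

# Calculated field: derive total catch from the stored fish lengths
# when it is needed, instead of storing it as its own field.
fish_lengths_mm = [112, 98, 130]  # hypothetical measurements
total_catch = len(fish_lengths_mm)

print(rep_rows)
print(date_fields)
print(total_catch)
```

Each sales rep ends up in its own row, each date component in its own field, and the total catch is computed from the stored records, so it cannot drift out of step with them.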


Characteristics of a “Good” Table

• Each table refers to a single class of entities or unit of observation or event

• There is a way to uniquely identify each entry in a table. This is called the PRIMARY KEY.

• It does not contain multipart, multivalued, or calculated fields.

• It does not contain unnecessary fields, or unnecessary redundant data

• It contains all of the fields necessary to link it to other tables you want to link (or relate) it to
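To make the PRIMARY KEY idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and stream names are hypothetical. It shows that the database refuses a second record with the same key, which is what guarantees each entry can be uniquely identified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE stream (
        stream_id   INTEGER PRIMARY KEY,   -- uniquely identifies each record
        stream_name TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO stream VALUES (1, 'Example Creek')")

def try_insert(stream_id, stream_name):
    """Insert a stream record; return False if the key is already in use."""
    try:
        conn.execute(
            "INSERT INTO stream VALUES (?, ?)", (stream_id, stream_name)
        )
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_accepted = try_insert(1, "Another Creek")  # key 1 is taken
new_accepted = try_insert(2, "Another Creek")        # key 2 is free
record_count = conn.execute("SELECT COUNT(*) FROM stream").fetchone()[0]
print(duplicate_accepted, new_accepted, record_count)
```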


First Cut at Developing Tables

Stream Table: Stream ID, Stream Name, Barrier or Reference

Shocking Event Table: Stream ID, Position (above/below), Segment, Date, Crew, Segment Length, Conductivity, Water Temperature, Weather, Water Conditions

Habitat Transect Table: Stream ID, Transect number, Width, Depth, ???Substrate???

Fish Table: Stream ID, Position (above/below), Fish name, Length, Total Catch


Refinements to Tables

Stream Table: Stream ID, Stream Name, Barrier or Reference

Shocking Event Table: Stream ID, Sampling Event ID, Position (above/below), Segment, Date, Crew, Segment Length, Conductivity, Water Temperature, Weather, Water Conditions

Habitat Transect Table: Stream ID, Sampling Event ID, Transect number, Width, Depth, ???Substrate???

Fish Table: Stream ID, Sampling Event ID, Position (above/below), Fish name, Fish species code, Length, Total Catch

Substrate Table: Sampling Event ID, Transect number, Particle ID, Particle size code
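A simplified sketch of how part of this design might look in practice, using Python's built-in sqlite3 module. The column names, units, species code "BKT", and all data values are hypothetical. Effort is assumed here to be the segment length, which the slides do not specify. The sketch keeps only a few fields per table and stores one row per fish, so catch is counted rather than stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE stream (
    stream_id   INTEGER PRIMARY KEY,
    stream_name TEXT NOT NULL,
    stream_type TEXT NOT NULL              -- 'Barrier' or 'Reference'
);
CREATE TABLE shocking_event (
    sampling_event_id INTEGER PRIMARY KEY,
    stream_id         INTEGER NOT NULL REFERENCES stream(stream_id),
    position          TEXT NOT NULL,       -- 'above' or 'below' the barrier
    segment           INTEGER NOT NULL,
    segment_length_m  REAL NOT NULL        -- assumed measure of effort
);
CREATE TABLE fish (
    fish_id           INTEGER PRIMARY KEY,
    sampling_event_id INTEGER NOT NULL
        REFERENCES shocking_event(sampling_event_id),
    species_code      TEXT NOT NULL,
    length_mm         REAL NOT NULL
);
""")

# Hypothetical data: one stream, one event above and one below the barrier.
conn.execute("INSERT INTO stream VALUES (1, 'Example Creek', 'Barrier')")
conn.executemany("INSERT INTO shocking_event VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "above", 1, 100.0),
    (2, 1, "below", 1, 50.0),
])
conn.executemany(
    "INSERT INTO fish (sampling_event_id, species_code, length_mm) "
    "VALUES (?, ?, ?)",
    [(1, "BKT", 100.0), (1, "BKT", 140.0),
     (2, "BKT", 120.0), (2, "BKT", 130.0), (2, "BKT", 140.0)],
)

# Catch, mean length, and catch per metre for each species at each event.
summary = conn.execute("""
    SELECT e.position,
           f.species_code,
           COUNT(*)                      AS catch,
           AVG(f.length_mm)              AS mean_length_mm,
           COUNT(*) / e.segment_length_m AS catch_per_m
    FROM fish AS f
    JOIN shocking_event AS e
      ON e.sampling_event_id = f.sampling_event_id
    GROUP BY e.sampling_event_id, f.species_code
    ORDER BY e.position, f.species_code
""").fetchall()

for row in summary:
    print(row)
```

Because the Fish table links to the Shocking Event table through the Sampling Event ID, the position and segment length do not need to be repeated in every fish record.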


Another example: Deer habitat use in SE Michigan

Habitat patches
-size
-cover type

Deer characteristics
-Deer ID
-age
-sex

Telemetry observation
-Year
-Month
-Day
-Time
-Deer ID
-Habitat patch (or lat/lon ?)


Homework

• Develop list of tables and fields for your database project

• With a partner, go over your list to determine if each table and field meets the criteria for being “good”