Relational databases

Posted on 21-Aug-2015

Introduction

• The most commonly used data format after spreadsheets.
• Spreadsheets are relatively easy to use.
• Research projects mostly claim that their data is stored in a relational database.
• Understanding a relational structure opens up access to many data sets.

Relational databases - Data mining

• Exploration of data
• Prerequisite: the data should be available in a minable format, i.e. a database
• Database = electronic document storing data
  – Non-relational: 1 bulk system with unrelated items (e.g. MS Excel files, text documents, unrelated tables)
  – Relational: all items (tables) are linked to each other (see further)

Relational databases - Why use a database

• Relational database:
  – All your data is stored in 1 file
    • Easy to retrieve data
    • Easy to back up
  – Data and metadata are stored together
    • Data ...
    • Metadata: data about the data (documentation)
  – Many data files contain undocumented values:
    • Species A has an abundance of 17 (what does the value 17 mean?)

Relational databases - Why use a database

• In a well-designed relational database, all data is stored only once:
  – Example: typing errors in a species list
    • Nudora thorakista
    • Nudora thorrakista
    • Nudora thorakhista
    • Nudora thorakisa
  – 1 species, but the species richness calculation gives 4
  – Solution: 1 table with 1 record per species, used as a reference
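The effect of the typing errors can be sketched with a small in-memory database. This is only an illustration: it uses SQLite through Python's sqlite3 module rather than MS Access, the table names obs_text, obs_ref and species are made up for the example, and it assumes the first spelling in the list is the intended one.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
-- Free-text design: the name is typed again for every observation.
CREATE TABLE obs_text (species_name TEXT);
INSERT INTO obs_text VALUES
  ('Nudora thorakista'), ('Nudora thorrakista'),
  ('Nudora thorakhista'), ('Nudora thorakisa');

-- Reference design: the name is stored once, observations store its id.
CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE obs_ref (species_id INTEGER REFERENCES species(id));
INSERT INTO species VALUES (1, 'Nudora thorakista');
INSERT INTO obs_ref VALUES (1), (1), (1), (1);
""")

# Species richness = number of distinct species in the observations.
richness_text = con.execute(
    "SELECT COUNT(DISTINCT species_name) FROM obs_text").fetchone()[0]
richness_ref = con.execute(
    "SELECT COUNT(DISTINCT species_id) FROM obs_ref").fetchone()[0]
print(richness_text, richness_ref)
```

With the free-text design every misspelling counts as a separate species; with the reference table the same four observations count as one.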

Why use a database

• Data is much more rigid ...
  – More difficult to make errors
  – E.g. sorting in Excel (sorting one column on its own can break up records)

Relational databases - Principle - Exercise

• A practical example to understand ...
  – Make a list of 15 people you know
  – Make a list of all genders
  – Make a list of characters and indicate for each character whether it is nice or not
  – Make a list of countries
• Start coupling all your lists
• You made a relational database

Relational database - biology

[Figure: relational scheme linking the tables Species, Person, Places, Sample, Country, Density and Equipment. The scheme is shown three times, the second and third time with an example question.]

• Which person was present on samplings in Sweden?
• Which species sampled with a core occur in densities higher than 40?

[Figure: the scheme extended with further tables: Variable, Var_value, Taxonomy, Photo, Literature, ...]
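The first question can be answered by following the links from Country to Places to Sample to Person. A minimal sketch, assuming a simplified version of the scheme: the field names, the single person per sample and all sample data are hypothetical, and SQLite (via Python's sqlite3 module) stands in for the actual database.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE places  (id INTEGER PRIMARY KEY, name TEXT,
                      country INTEGER REFERENCES country(id));
CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sample  (id INTEGER PRIMARY KEY,
                      place INTEGER REFERENCES places(id),
                      person INTEGER REFERENCES person(id));
-- Hypothetical data
INSERT INTO country VALUES (1, 'Sweden'), (2, 'Belgium');
INSERT INTO places  VALUES (1, 'Station A', 1), (2, 'Station B', 2);
INSERT INTO person  VALUES (1, 'Person X'), (2, 'Person Y');
INSERT INTO sample  VALUES (1, 1, 1), (2, 2, 2), (3, 1, 1);
""")

# Walk the relations: person <- sample -> places -> country.
rows = con.execute("""
SELECT DISTINCT person.name
FROM person
JOIN sample  ON sample.person = person.id
JOIN places  ON sample.place = places.id
JOIN country ON places.country = country.id
WHERE country.name = 'Sweden'
ORDER BY person.name
""").fetchall()
people_in_sweden = [name for (name,) in rows]
print(people_in_sweden)
```

DISTINCT is needed because Person X took two samples in Sweden and would otherwise be listed twice.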

Relational databases - Principles

• Think before you start ...
  – The structure of a database is the key to a good dataset
  – The structure has to translate the whole concept
• One look at the structure (relational scheme) should explain the database

Relational databases - components

• Tables
  – Basic structures containing the data
  – Structure of table important
  – ID
• Relations
  – Definition of how different tables are connected and form a meaningful unit
• Queries
  – Extractions of data from the database

Table designs ...

• A table consists of a series of columns ...
• Each record as such:
  – Different fields
  – Design of table must be done before data is entered
  – Each field: name, data type
  – Each field can also be given a formatted layout

[Figure: a table with the labels Record, Column and Field]

Table designs ...

• Field types:
  – Numeric: integer/double
  – Text
  – Date/Time
  – Memo
  – Autonumber ID
  – Yes/No
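As a rough illustration of a table design with these field types, the sketch below creates one table in SQLite through Python's sqlite3 module. The table and field names and the inserted values are hypothetical, and the mapping is an approximation: SQLite has no dedicated Date/Time, Memo or Yes/No types, so text and 0/1 integers are used instead.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
CREATE TABLE sample (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,  -- Autonumber ID
    replicate INTEGER,                            -- Numeric (integer)
    depth_m   REAL,                               -- Numeric (double)
    station   TEXT,                               -- Text
    taken_on  TEXT,                               -- Date/Time, kept as ISO 8601 text
    remarks   TEXT,                               -- Memo (long free text)
    sieved    INTEGER CHECK (sieved IN (0, 1))    -- Yes/No
)
""")

# The id is not supplied: the database numbers the record itself.
con.execute(
    "INSERT INTO sample (replicate, depth_m, station, taken_on, remarks, sieved) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    (1, 12.5, "Station 2", "2015-08-21 10:30", "calm weather", 1),
)
new_id = con.execute("SELECT id FROM sample").fetchone()[0]
columns = [row[1] for row in con.execute("PRAGMA table_info(sample)")]
print(new_id, columns)
```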

Exercise on field types:

• 12
• 15 jan 1988
• hallo
• 12,456
• 12:56
• Azdazdazd azdda zda azdd dad zd dadazdzd azdazddazdd azdazd azdazd dzdzdzzd ada zzd azdaz dda azd da az d z azdzadazd a zd a azd azd z dd da a z a z zd d ddaa zd
• 09:89

Special field in a table: key

• A key = a unique identifier for a record
  – Example: passport number:
    • Number in a database which is unique and relates to all data about you
  – Each record in a table also gets a key
  – This key is used to link tables to each other
  – Example:
    • Nudora sp1 – id: 123776
    • Nudora sp2 – id: 34688
  – Advantage: when a species name changes, linked taxa remain linked
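The advantage of linking on the key rather than on the name can be sketched as follows. The ids are the ones from the example above; the table layout, the density value and the new name 'Nudora renamed' are assumptions made for the illustration, which uses SQLite via Python's sqlite3 module.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE density (species INTEGER REFERENCES species(id), value REAL);
INSERT INTO species VALUES (123776, 'Nudora sp1'), (34688, 'Nudora sp2');
INSERT INTO density VALUES (123776, 32);
""")

# Rename the species: one record changes, the key stays the same.
con.execute("UPDATE species SET name = 'Nudora renamed' WHERE id = 123776")

# The density record still finds its species through the unchanged key.
linked = con.execute("""
SELECT species.name, density.value
FROM density JOIN species ON density.species = species.id
""").fetchall()
print(linked)
```

Had the density table stored the name itself, every density record would have needed the same correction.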

Linking tables through ids

• Storing numbers is the most efficient way to store data:
• Nudora sp1 is found in the North Sea with a density of 32
• Species 123776 is found in station 2 (North Sea) with a density of 32
• Record in table density becomes: 123776 | 2 | 32
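A short sketch of how the numeric record translates back into the readable sentence. The values come from the example above; the table and field names are assumptions, and SQLite (Python's sqlite3 module) is used for the illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE station (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE density (species INTEGER, station INTEGER, value INTEGER);
INSERT INTO species VALUES (123776, 'Nudora sp1');
INSERT INTO station VALUES (2, 'North Sea');
INSERT INTO density VALUES (123776, 2, 32);
""")

# What is actually stored: three numbers.
stored = con.execute("SELECT * FROM density").fetchone()

# What a reader sees after following both ids to their tables.
readable = con.execute("""
SELECT species.name, station.name, density.value
FROM density
JOIN species ON density.species = species.id
JOIN station ON density.station = station.id
""").fetchone()
print(stored, readable)
```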

Setting up relations between tables

• Relations: links between tables
• Connecting tables to each other in a rigid way through certain fields
• Advantage: the database becomes a strong unit
• Types of relations:
  – 1 to many
  – Many to many (= 2 times 1 to many)

Examples of relations

• Table places: field country (numeric)
• Table countries: list of countries, each country has a unique id
• Relation is made between:
  – Field country in places
  – Field id in country
• One-to-many relation: 1 record in table country is linked to multiple records in places
• Countries that are still linked to places cannot be deleted

[Figure: relation between the tables Places and Country]
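The one-to-many relation and the blocked deletion can be sketched with a foreign key. This is an illustration in SQLite via Python's sqlite3 module, not the MS Access relationship dialog; the place names are hypothetical. Note that SQLite only enforces such relations after PRAGMA foreign_keys = ON.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
con.executescript("""
CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE places (
    id      INTEGER PRIMARY KEY,
    name    TEXT,
    country INTEGER NOT NULL REFERENCES country(id)  -- the relation
);
INSERT INTO country VALUES (1, 'Sweden');
INSERT INTO places VALUES (1, 'Place A', 1), (2, 'Place B', 1);
""")

# 1 record in country is linked to multiple records in places.
n_places = con.execute(
    "SELECT COUNT(*) FROM places WHERE country = 1").fetchone()[0]

# Deleting a country that is still in use violates the relation.
try:
    con.execute("DELETE FROM country WHERE id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True
print(n_places, delete_blocked)
```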

Examples of relations

• Many to many
• Id of sample
• Id of species
• Table density: unique combination of sample, species ...

[Figure: the tables Species and Sample linked through the table Density]
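A many-to-many relation built from two one-to-many relations might look like the sketch below, where the combination of sample and species is made unique with a composite primary key. Field names and data are hypothetical, any further fields implied by the "..." are left out, and SQLite via Python's sqlite3 module is used for the illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sample  (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE density (
    sample  INTEGER NOT NULL REFERENCES sample(id),
    species INTEGER NOT NULL REFERENCES species(id),
    value   REAL,
    PRIMARY KEY (sample, species)  -- each combination may occur only once
);
INSERT INTO species VALUES (1, 'Nudora sp1'), (2, 'Nudora sp2');
INSERT INTO sample  VALUES (10, 'Sample A'), (11, 'Sample B');
-- One sample holds many species, one species occurs in many samples.
INSERT INTO density VALUES (10, 1, 32), (10, 2, 5), (11, 1, 40);
""")

# A second density for the same sample and species is refused.
try:
    con.execute("INSERT INTO density VALUES (10, 1, 99)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
n_rows = con.execute("SELECT COUNT(*) FROM density").fetchone()[0]
print(duplicate_rejected, n_rows)
```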

Queries

• All data is in the database:
  – Next step: get it out again
  – Selections on 1 table: by using filters
  – Selections on multiple tables: using queries
  – Queries can be saved and reused
  – Queries can be the basis for new queries
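Saved queries that build on each other correspond roughly to views in SQL. The sketch below is an assumption-laden illustration in SQLite (Python's sqlite3 module): the table, the view names high_density and high_density_per_station, and the data are all made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE density (species TEXT, station TEXT, value REAL);
INSERT INTO density VALUES
  ('Nudora sp1', 'Station 1', 32),
  ('Nudora sp2', 'Station 1', 55),
  ('Nudora sp1', 'Station 2', 48);

-- A saved query ...
CREATE VIEW high_density AS
    SELECT * FROM density WHERE value > 40;

-- ... used as the basis for a new query.
CREATE VIEW high_density_per_station AS
    SELECT station, COUNT(*) AS n FROM high_density GROUP BY station;
""")

result = con.execute(
    "SELECT station, n FROM high_density_per_station ORDER BY station"
).fetchall()
print(result)
```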

Sorting on tables

• Sorting

Filtering on tables

Making a simple selection Query

• Create ... Query in design view
• Switching between views:
• Select the tables and/or queries needed
• Select the fields needed for output/selection/sorting
• Set the criteria
• Select the values to output and add sorting options

Output the results

• Go to datasheet view
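The steps of the design view map onto the clauses of an SQL selection query. The sketch below is an illustration under assumptions: a single hypothetical table with made-up data, run in SQLite through Python's sqlite3 module instead of MS Access.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE density (species TEXT, equipment TEXT, value REAL);
INSERT INTO density VALUES
  ('Species A', 'core', 55), ('Species B', 'core', 12),
  ('Species C', 'grab', 80), ('Species D', 'core', 41);
""")

query = """
SELECT species, value         -- fields needed for output
FROM density                  -- table needed
WHERE equipment = 'core'      -- criteria
  AND value > 40
ORDER BY value DESC           -- sorting option
"""
rows = con.execute(query).fetchall()  # the "datasheet view" of the query
for row in rows:
    print(row)
```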

Making a simple selection Query

• Special options ...

Exporting data

• From MS Access it is possible to export to different formats!
• Tables, queries, ...
• Exports can be used for further data mining:
  – Making graphs in MS Excel
  – Statistical analysis
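Outside MS Access, the same idea can be sketched by writing a query result to CSV, a format that spreadsheets and statistical packages can read. The table and data are hypothetical, and the in-memory buffer stands in for a real file opened with open(..., "w", newline="").

```python
import csv
import io
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE density (species TEXT, value INTEGER);
INSERT INTO density VALUES ('Nudora sp1', 32), ('Nudora sp2', 5);
""")

cur = con.execute("SELECT species, value FROM density ORDER BY species")

buffer = io.StringIO()  # stand-in for an export file
writer = csv.writer(buffer)
writer.writerow([col[0] for col in cur.description])  # header row with field names
writer.writerows(cur)                                 # one line per record
csv_text = buffer.getvalue()
print(csv_text)
```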

Step by step demonstration

• Open a database
• Different items in database
• Open tables, sorting, filtering
• Table design
• Relationships
• Queries

Query operators

• =  equals
• >  larger than
• <  smaller than
• >=  larger than or equal to
• Between ... And ...
• Is null
• Like ...
• Not like ...

Query operators

• and  both true
• or  at least 1 true

Examples on the field VOORNAAM (first name):

• >"q*" and <"u*"  gives René, Robbie, Stefan, Stijn, Tim, Tristam
• ="r*" or "s*"  gives Robbie, Stefan, Stijn
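Some of these operators can be tried out in SQL as well. The sketch below uses SQLite through Python's sqlite3 module, which differs from MS Access in a few ways: the wildcard is % instead of *, and plain comparisons are case-sensitive, so lower() is applied first. The first names are the ones listed above plus three hypothetical ones (Anna, Peter, Wim); results on other data or another database engine may differ.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE person (voornaam TEXT)")
con.executemany(
    "INSERT INTO person VALUES (?)",
    [(n,) for n in ["Anna", "Peter", "René", "Robbie", "Stefan",
                    "Stijn", "Tim", "Tristam", "Wim"]],
)

def names(where):
    """Return the first names matching a WHERE condition, sorted."""
    sql = f"SELECT voornaam FROM person WHERE {where} ORDER BY voornaam"
    return [row[0] for row in con.execute(sql)]

# and: both conditions true (names after 'q' and before 'u')
between_q_and_u = names("lower(voornaam) > 'q' AND lower(voornaam) < 'u'")
# or: at least 1 condition true, combined with Like
starts_r_or_s = names("voornaam LIKE 'r%' OR voornaam LIKE 's%'")
# Not like
not_like_t = names("voornaam NOT LIKE 't%'")

print(between_q_and_u)
print(starts_r_or_s)
print(not_like_t)
```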

Intermezzo ... Design a dataset

• Research project:
  – You work with 3 persons on it
  – You will sample 4 times on 3 locations
  – You will measure 5 environmental characteristics
  – You will identify all species
  – You will count them
  – Extra: you will measure each specimen
  – Task: design on paper what your dataset will look like