RELDES/2i1417/2i1071/2i4217 autumn term 2006 – nikos dimitrakas
Temporal Databases
Multimedia & Databases
DB2 Extenders
nikos dimitrakas
www.nikosdimitrakas.com
[email protected] 6626
Temporal Databases
• Time
– Transaction time
When something is registered in the database
– Valid time
When something occurs in the real world
Database Taxonomy
• Snapshot databases
– no time
– just the current state
• Historical databases
– only valid time
– records the full history/evolution of concepts in the database
• Rollback databases
– only transaction time
– records all previous states of the database
• Temporal databases
– both valid and transaction time
– combines the advantages of both historical and rollback databases
An Example
Consider the following trivial information:
• A database of people and their salary.
Snapshot database:
PID  Name  Salary
001  Bob   7500
002  John  8100
…    …     …
Snapshot Database
PID  Name  Salary
001  Bob   7500
002  John  8100
…    …     …
We can only see the current value. We do not know when the value was entered in the database nor whether it has ever been changed.
Historical Database
PID  Name  Salary  StartTime  EndTime
001  Bob   6500    Jan 1998   Feb 1999
001  Bob   8000    Feb 1999   Dec 1999
001  Bob   7500    Dec 1999   <now>
002  John  7900    May 1999   Mar 2001
002  John  8100    Mar 2001   <now>
…    …     …       …          …
We can see the valid value for any point in time. We still do not know when a value was entered in the database, nor whether it has ever been changed (error correction).
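A minimal sketch of such a valid-time lookup, using the rows of the table above. The half-open interval convention [StartTime, EndTime), the reading of month-only dates as the first day of the month, and the function name salary_at are assumptions made for illustration only.

```python
from datetime import date

# Rows copied from the Historical Database table; None stands for <now>.
# Assumption: "Jan 1998" means 1998-01-01 and intervals are [start, end).
HISTORY = [
    ("001", "Bob", 6500, date(1998, 1, 1), date(1999, 2, 1)),
    ("001", "Bob", 8000, date(1999, 2, 1), date(1999, 12, 1)),
    ("001", "Bob", 7500, date(1999, 12, 1), None),
    ("002", "John", 7900, date(1999, 5, 1), date(2001, 3, 1)),
    ("002", "John", 8100, date(2001, 3, 1), None),
]


def salary_at(rows, pid, when):
    """Return the salary valid for `pid` on `when`, or None if no row covers it."""
    for row_pid, _name, salary, start, end in rows:
        if row_pid == pid and start <= when and (end is None or when < end):
            return salary
    return None
```

Calling salary_at(HISTORY, "001", date(1999, 3, 1)) looks up Bob's salary for March 1999; a date before a person's first row gives None.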
Historical Database error correction
If we discover an error in the database, we can correct it for the affected period of time. If we find out that Bob earned 7500 between February 1999 and April 1999, we can modify our data to the following:
PID  Name  Salary  StartTime  EndTime
001  Bob   6500    Jan 1998   Feb 1999
001  Bob   7500    Feb 1999   Apr 1999
001  Bob   8000    Apr 1999   Dec 1999
001  Bob   7500    Dec 1999   <now>
002  John  7900    May 1999   Mar 2001
002  John  8100    Mar 2001   <now>
…    …     …       …          …
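One way to carry out such a correction is to split every existing row that overlaps the corrected period and then add the corrected row. The sketch below does this for Bob's three original rows; the function name correct, the half-open intervals and the first-of-month dates are assumptions, not something the slides prescribe.

```python
from datetime import date

# Bob's rows before the correction; None stands for <now>.
# Assumption: month-only dates mean the first of the month, intervals are [start, end).
BOB = [
    ("001", "Bob", 6500, date(1998, 1, 1), date(1999, 2, 1)),
    ("001", "Bob", 8000, date(1999, 2, 1), date(1999, 12, 1)),
    ("001", "Bob", 7500, date(1999, 12, 1), None),
]


def correct(rows, pid, name, salary, start, end):
    """Return a new row list in which `pid` has `salary` during [start, end)."""
    result = []
    for row in rows:
        r_pid, r_name, r_salary, r_start, r_end = row
        overlaps = r_pid == pid and r_start < end and (r_end is None or start < r_end)
        if not overlaps:
            result.append(row)
            continue
        if r_start < start:
            # Keep the part of the old row that lies before the corrected period.
            result.append((r_pid, r_name, r_salary, r_start, start))
        if r_end is None or end < r_end:
            # Keep the part of the old row that lies after the corrected period.
            result.append((r_pid, r_name, r_salary, end, r_end))
    result.append((pid, name, salary, start, end))
    return sorted(result, key=lambda r: (r[0], r[3]))


corrected = correct(BOB, "001", "Bob", 7500, date(1999, 2, 1), date(1999, 4, 1))
```

Note that the old 8000 row is overwritten in place: after the correction nothing records that the database ever held a different value for February to April 1999.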
Rollback Database
PID  Name  Salary  TransTime
001  Bob   6500    Mar 1998
001  Bob   8000    Feb 1999
001  Bob   7500    Sep 2001
002  John  7900    Mar 2001
002  John  8100    Sep 2001
…    …     …       …
We can see all the values that have been current in the database. We know when the values were entered in the database, but we do not know when they were valid in the real world. We cannot change values in any of the records. We can only add new records.
Rollback Database interpretation
PID  Name  Salary  TransTime
001  Bob   6500    Mar 1998
001  Bob   8000    Feb 1999
001  Bob   7500    Sep 2001
002  John  7900    Mar 2001
002  John  8100    Sep 2001
…    …     …       …
We can always see which value was current in the database at any point in time. It can be an advantage to know what errors were contained in the database at a certain time!
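The append-only behaviour and the "what was current at time T" lookup can be sketched as follows. The names record and current_at are hypothetical, and month-only transaction times are assumed to mean the first day of the month.

```python
from datetime import date

# Rows from the Rollback Database table.
# Assumption: "Mar 1998" means 1998-03-01.
LOG = [
    ("001", "Bob", 6500, date(1998, 3, 1)),
    ("001", "Bob", 8000, date(1999, 2, 1)),
    ("001", "Bob", 7500, date(2001, 9, 1)),
    ("002", "John", 7900, date(2001, 3, 1)),
    ("002", "John", 8100, date(2001, 9, 1)),
]


def record(log, pid, name, salary, trans_time):
    """Add a new state; existing records are never modified or removed."""
    return log + [(pid, name, salary, trans_time)]


def current_at(log, pid, as_of):
    """Return the salary the database held for `pid` at transaction time `as_of`."""
    known = [r for r in log if r[0] == pid and r[3] <= as_of]
    if not known:
        return None
    return max(known, key=lambda r: r[3])[2]
```

current_at answers what the database contained at a given moment, not what was true in the real world at that moment, which is exactly the limitation of a rollback database.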
Temporal Database
PID  Name  Salary  StartTime  EndTime   TranTime
001  Bob   6500    Jan 1998   Feb 1999  Mar 1998
001  Bob   8000    Feb 1999   Dec 1999  Feb 1999
001  Bob   7500    Dec 1999   <now>     Sep 2001
002  John  7900    May 1999   Mar 2001  Mar 2001
002  John  8100    Mar 2001   <now>     Sep 2001
…    …     …       …          …         …
We can see the valid value for any point in time. We can see when a value was entered in the database. We can see what errors have been corrected and when.
Temporal Database error correction
PID  Name  Salary  StartTime  EndTime   TranTime
001  Bob   6500    Jan 1998   Feb 1999  Mar 1998
001  Bob   8000    Feb 1999   Dec 1999  Feb 1999
001  Bob   7500    Dec 1999   <now>     Sep 2001
002  John  7900    May 1999   Mar 2001  Mar 2001
002  John  8100    Mar 2001   <now>     Sep 2001
001  Bob   7500    Feb 1999   Apr 1999  May 2003
001  Bob   8000    Apr 1999   Dec 1999  May 2003
…    …     …       …          …         …
Errors can be corrected, but the transaction time shows when they were corrected. We can therefore still see that until May 2003 the database "believed" that Bob earned 8000 between February 1999 and December 1999.
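A query over both time dimensions can be sketched with SQLite from Python. The table and column names, the ISO date strings, the half-open valid-time intervals and the rule "among the rows known at the given transaction time, the most recently recorded one wins" are all assumptions made for this illustration; DB2 itself is not used here.

```python
import sqlite3

# Rows from the error-correction table. Assumptions: month-only dates are
# stored as the first of the month, NULL stands for <now>, and valid-time
# intervals are half-open: [start_time, end_time).
ROWS = [
    ("001", "Bob", 6500, "1998-01-01", "1999-02-01", "1998-03-01"),
    ("001", "Bob", 8000, "1999-02-01", "1999-12-01", "1999-02-01"),
    ("001", "Bob", 7500, "1999-12-01", None, "2001-09-01"),
    ("002", "John", 7900, "1999-05-01", "2001-03-01", "2001-03-01"),
    ("002", "John", 8100, "2001-03-01", None, "2001-09-01"),
    ("001", "Bob", 7500, "1999-02-01", "1999-04-01", "2003-05-01"),
    ("001", "Bob", 8000, "1999-04-01", "1999-12-01", "2003-05-01"),
]


def make_db():
    """Create an in-memory database holding the rows above."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE salary (pid TEXT, name TEXT, salary INTEGER, "
        "start_time TEXT, end_time TEXT, tran_time TEXT)"
    )
    con.executemany("INSERT INTO salary VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return con


def believed_salary(con, pid, valid_at, known_at):
    """What the database said at `known_at` (tt) about the salary at `valid_at` (vt)."""
    row = con.execute(
        """
        SELECT salary FROM salary
        WHERE pid = ?
          AND tran_time <= ?
          AND start_time <= ?
          AND (end_time IS NULL OR ? < end_time)
        ORDER BY tran_time DESC
        LIMIT 1
        """,
        (pid, known_at, valid_at, valid_at),
    ).fetchone()
    return row[0] if row else None
```

With a transaction time before May 2003, believed_salary(con, "001", "1999-03-01", ...) still sees only the uncorrected row; with a later transaction time it picks up the correction.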
Querying
Q1: How much does Bob earn?
Q2: How much did Bob earn in March 1999?
Q3: What was Bob’s current salary in March 1999?
Q4: What did we think that Bob earned in March 1999 (vt) a year ago (tt)?
Q5: What did we think that Bob earned in March 1999 (vt) three years ago (tt)?
     Snapshot             Historical           Rollback             Temporal
Q1   7500                 7500                 7500                 7500
Q2   cannot be expressed  7500                 maybe 8000           7500
Q3   cannot be expressed  cannot be expressed  8000                 8000
Q4   cannot be expressed  cannot be expressed  cannot be expressed  7500
Q5   cannot be expressed  cannot be expressed  cannot be expressed  8000
Pros & Cons
• Pros
– No missing data
• Cons
– Requires more space
– More complicated to update records
– More complicated to query the database
• Workarounds
– Horizontal segmentation (simplifies querying of current data)
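Horizontal segmentation can be illustrated by splitting the rows of a historical table into a small "current" part and a larger "history" part, so that queries on current data never touch old rows. This is only a sketch: the function name segment and the use of None for <now> are assumptions, and in a real DBMS the two parts would be separate tables or partitions.

```python
from datetime import date

# Rows in the Historical Database layout; None stands for <now>.
ROWS = [
    ("001", "Bob", 6500, date(1998, 1, 1), date(1999, 2, 1)),
    ("001", "Bob", 8000, date(1999, 2, 1), date(1999, 12, 1)),
    ("001", "Bob", 7500, date(1999, 12, 1), None),
    ("002", "John", 7900, date(1999, 5, 1), date(2001, 3, 1)),
    ("002", "John", 8100, date(2001, 3, 1), None),
]


def segment(rows):
    """Split rows into (current, history) depending on whether EndTime is <now>."""
    current = [r for r in rows if r[4] is None]
    history = [r for r in rows if r[4] is not None]
    return current, history


current, history = segment(ROWS)
# A query on current salaries now only needs the small `current` segment.
current_salaries = {pid: salary for pid, _name, salary, _start, _end in current}
```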
Multimedia
• Types
– Images, graphics
– Audio
– Video
– Text documents
– ??
• Aspects
– Time
– Size
– Interpretation
Time & Dimensions
• Text
– Time independent (discrete)
– 1 dimension
• Audio
– Time dependent (continuous)
– 1 dimension
• Image
– Time independent
– 2 dimensions (height & width)
• Video
– Time dependent
– 3 dimensions (height, width & time)
Managing & Effects of Size
• Multimedia in file system (Reference in database)
• Multimedia in database
– Stored separately
– Mixed with “normal” data
• Storage choice affects performance
– slower access to multimedia, faster access to normal data
• Vertical segmentation
Applications
• Travel industry
• Entertainment industry
• Medical databases
• Text and photograph archives
• Digital libraries
• Electronic encyclopedias
• Geographic information systems
• Shopping guides
• …
Requirements
• Querying
– On
» Content dependent data
» Content descriptive data
» Content independent data
• Retrieval
– Fast retrieval
– Smooth retrieval for time dependent media
• Presentation
– This can be placed outside the DBMS
Multimedia Metadata
• Data about the multimedia objects
– Content dependent data
Data that can be derived from the contents of the multimedia object
The color of the bird in the picture.
The lyrics of the song.
– Content descriptive data
Data that is associated with the contents of the multimedia object, but cannot be automatically identified
The breed of the bird in the picture.
The singer’s sex.
– Content independent data
Data that is associated with the multimedia object, but does not relate to its contents
The name of the photographer.
The name of the song writer, the brand of the microphone used.
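The three kinds of metadata could be kept apart in a simple record structure, as in the sketch below. The class name MediaMetadata, the helper find and all attribute values ("red", "robin", "J. Smith") are made up for illustration; the slides only name the categories.

```python
from dataclasses import dataclass, field


@dataclass
class MediaMetadata:
    """Metadata of one multimedia object, grouped by the three categories."""
    content_dependent: dict = field(default_factory=dict)    # derivable from the object
    content_descriptive: dict = field(default_factory=dict)  # about the content, added manually
    content_independent: dict = field(default_factory=dict)  # unrelated to the content


# Hypothetical example values for a picture of a bird.
bird_photo = MediaMetadata(
    content_dependent={"bird_color": "red"},
    content_descriptive={"breed": "robin"},
    content_independent={"photographer": "J. Smith"},
)


def find(items, category, key, value):
    """Return the items whose metadata in `category` has `key` equal to `value`."""
    return [m for m in items if getattr(m, category).get(key) == value]
```

A query such as "find all images taken by J. Smith" then becomes find(images, "content_independent", "photographer", "J. Smith").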
Querying needs
• Find all images taken by J. Smith!
• Find all images with the same color, shape and texture as this image!
• Find all images which look like this image!
• Find all images with the same color distribution as this sunset photograph!
• Find all images which contain a car!
• Find all images which contain a car and a man who looks like this!
• Find all the songs that are about surfing!
• Find all documents that have the words church and song in the same sentence!
Querying needs
• Find all videos with big explosions!
• Find a video where there is an explosion after a car chase!
• Find an instrumental audio clip with both violin and electric guitar!
• Show me the first 10 seconds of all the videos!
• Show me the 20 seconds exactly before the explosion in videos that contain explosions!
• Find all songs that are duets and end with a fade-out!
• Find all images with a dog standing next to a tree!
• …
Query Types
• Queries on the content of the media information
– Find all images which contain a car!
– Find all videos with big explosions!
– Find an instrumental audio clip with both violin and electric guitar!
• Queries by example (QBE)
– Find all images which look like this image!
• Time indexed queries
– Show me the first 10 seconds of all the videos!
• Spatial queries
– Find all images with a dog standing next to a tree!
– Find a video where there is an explosion after a car chase!
– Find all documents that have the words church and song in the same sentence!
• Application specific queries
– Queries on content descriptive/content independent data (?)
• Combinations of the above
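The text example among the spatial queries ("church" and "song" in the same sentence) can be sketched in a few lines. The sentence splitting below is a deliberately naive assumption (split after ".", "!" or "?"), and the function name is hypothetical; a text extender would use proper linguistic processing.

```python
import re


def words_in_same_sentence(text, first, second):
    """Return True if some sentence of `text` contains both words (case-insensitive)."""
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    for sentence in sentences:
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        if first.lower() in words and second.lower() in words:
            return True
    return False
```

The query is "spatial" in the sense that it depends on where the words occur relative to each other, not only on whether they occur in the document.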
Querying Multimedia vs. Normal Data
                Multimedia Data  Normal Data
Query Matching  Approximate      Exact
Sorting         By Relevance     User Specified
Examples:
• All people that live in Stockholm sorted by name.
• All pictures like this one.
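The contrast between the two examples can be sketched as follows: the first query matches exactly and sorts by a user-specified column, the second scores every image against an example and sorts by that score. All data is hypothetical, and histogram intersection over three color bins is just one simple similarity measure chosen for illustration.

```python
# Hypothetical "normal" data: (name, city).
PEOPLE = [("Maria", "Stockholm"), ("Erik", "Uppsala"), ("Anna", "Stockholm")]

# Hypothetical 3-bin color histograms (shares of red, green, blue) per image.
IMAGES = {
    "sunset": [0.7, 0.2, 0.1],
    "forest": [0.1, 0.8, 0.1],
    "sea": [0.1, 0.1, 0.8],
}


def people_in(city):
    """Exact match on city, sorted by name (order chosen by the user)."""
    return sorted(name for name, person_city in PEOPLE if person_city == city)


def similarity(hist_a, hist_b):
    """Histogram intersection: 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def images_like(example_hist):
    """Approximate match: every image is a candidate, ranked by relevance."""
    return sorted(IMAGES, key=lambda name: similarity(IMAGES[name], example_hist), reverse=True)
```

The exact query returns only rows that satisfy the condition; the approximate query returns all images, and it is the ranking that carries the answer.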
Multimedia Analysis
• Identify features
– There is a river in the picture
– There is a violin playing
– Blue eyes, brown hair
• Manually add features
– The person in the picture is Indian
– The dog’s name is Barky
• Identify relations
– The car is on the left of the house
– The person is inside the car
– The saxophone solo was before the guitar solo
• Manually add relations
– The dog is owned by Howard
– James employs Roger
Other Considerations
• Similarity searches
• Ranking results
• Weighting conditions
• Synonym management
Example
IBM DB2 & Multimedia
• Built-in support for
– Basic content dependent data
– Query by example (image only)
– Spatial queries (text only)
– Returning only complete objects
• Possible to define other tables and columns for features and relations
• Possible to define functions
• Possible to extend (with programming)
• Multimedia Storage
– in the database
– outside the database
DB2 Extenders & Assignments
• Text
– Exact searches
– Linguistic searches
– Spatial searches
– Synonym searches
– Similarity searches
– Ranking results
• Audio, Video, Image
– Metadata searches
• Image only
– Query by example
– Color related searches
– Similarity searches
– Ranking
• Storage
– In the database
» mixed with other data (Text)
» in separate tables (Multimedia)
DB2 Extenders & Labs
• Text
– Create database
– Load data
– Enable for text indexing
– Query database
• Audio, Video, Image
– Create database
– Enable for multimedia
– Load data
– Query database
We will not extract the actual multimedia objects, since that would require presentation facilities that are not included in DB2.
This would be an interesting project for those who want to learn more about working with multimedia and would not mind doing a little programming.
Project course: IS8/2i1410 Current Problems in Information Systems
More on Text Retrieval: ISBI/2I1068/2I4078 Internet Search Techniques and Business Intelligence
Large Objects vs. Performance testing in DB2
1. Create a table with multimedia and normal columns!
2. Load the table with data!
3. Write a query for extracting only normal data (Varchars, Integers, etc.)!
4. Evaluate your query with an Access Plan!
5. Create a similar table with only the normal columns!
6. Load the same data to the new table!
7. Rewrite the query so that it uses the new table!
8. Evaluate the new query!
9. Compare the two results!
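The steps above can be tried out in miniature with SQLite from Python before doing them in DB2. This is only a sketch under several assumptions: the table names mixed and plain, the column names, the generated data and the query are invented, and simple wall-clock timing stands in for the DB2 access plan of steps 4 and 8. SQLite stores large objects differently from DB2, so the timings say nothing about what DB2 will show.

```python
import sqlite3
import time


def run_experiment(rows=100, blob_size=10_000):
    """Run the same query on a table with a BLOB column and on one without."""
    con = sqlite3.connect(":memory:")
    # Steps 1 and 5: one table with a multimedia column, one with normal columns only.
    con.execute("CREATE TABLE mixed (id INTEGER PRIMARY KEY, name TEXT, media BLOB, price INTEGER)")
    con.execute("CREATE TABLE plain (id INTEGER PRIMARY KEY, name TEXT, price INTEGER)")

    # Steps 2 and 6: load the same (generated) data into both tables.
    blob = bytes(blob_size)  # placeholder for an image or audio clip
    data = [(i, f"item{i}", i % 50) for i in range(rows)]
    con.executemany(
        "INSERT INTO mixed VALUES (?, ?, ?, ?)",
        [(i, name, blob, price) for i, name, price in data],
    )
    con.executemany("INSERT INTO plain VALUES (?, ?, ?)", data)

    # Steps 3 and 7: a query that extracts only normal data.
    query = "SELECT name, price FROM {} WHERE price > 40 ORDER BY id"

    # Steps 4 and 8: evaluate each query (here: time it).
    results, timings = {}, {}
    for table in ("mixed", "plain"):
        started = time.perf_counter()
        results[table] = con.execute(query.format(table)).fetchall()
        timings[table] = time.perf_counter() - started
    con.close()
    return results, timings


# Step 9: compare the two results.
results, timings = run_experiment()
```

Both queries must return the same rows; only the cost of producing them may differ, which is the point of the comparison.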