Topics in Database Systems

History, Present, and Future of the Database Field

Πάνος Βασιλειάδης, pvassil@cs.uoi.gr

September 2003

www.cs.uoi.gr/~pvassil/courses/readings/

Topics

- Yesterday
- Today
- Tomorrow

Part of these slides comes from Prof. Timos Sellis' course. Many thanks!

Yesterday

History of the field of databases

Late 60's: network (CODASYL) & hierarchical (IMS) DBMS.
- Low-level “record-at-a-time” DML, i.e. physical data structures reflected in DML (no data independence)

1970: Codd's paper on the relational model. The most influential paper in DB research.
- Set-at-a-time DML.
- Data independence: allows for schema and physical storage structures to change under the covers.
- Truly important theory, led to a "paradigm shift" in thinking and in practice.
- Papadimitriou: "as clear a paradigm shift as we can hope to find in computer science".
- Turing Award for Codd (1981).
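
The contrast between record-at-a-time and set-at-a-time DML can be sketched in a few lines. This is only an illustration, not CODASYL or IMS code: the `EMPLOYEES` data, the `emp` table, and both function names are hypothetical, and SQLite (via Python's standard `sqlite3` module) stands in for a relational engine.

```python
import sqlite3

# Hypothetical toy data: (employee, department, salary).
EMPLOYEES = [("ann", "db", 50), ("bob", "os", 40), ("eve", "db", 60)]


def record_at_a_time(records, dept):
    """Walk the stored records one by one, navigational style.

    The loop hard-codes the storage order and the field positions, so a
    change in the physical layout means rewriting this code.
    """
    total = 0
    position = 0  # explicit pointer to the current record
    while position < len(records):
        _name, record_dept, salary = records[position]
        if record_dept == dept:
            total += salary
        position += 1
    return total


def set_at_a_time(records, dept):
    """State what is wanted in SQL and let the engine choose the access path."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
    conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", records)
    # Adding an index changes the physical design but not the query below.
    conn.execute("CREATE INDEX emp_dept ON emp (dept)")
    (total,) = conn.execute(
        "SELECT COALESCE(SUM(salary), 0) FROM emp WHERE dept = ?", (dept,)
    ).fetchone()
    conn.close()
    return total
```

Both functions return the same total for a department. Only the second one keeps working unchanged if the table is re-indexed or stored differently, which is the data independence argument in miniature.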

Early-to-mid 70's: raging debate between the two camps.
- "Great debate" in 1975

Mid 70's: 2 full-function (sort of) prototypes
- Ingres
- System R
- Ancestors of essentially all today's commercial systems

Ingres: UCB, 1974-77
- a "pickup team", including Stonebraker & Wong
- early and pioneering
- led to Ingres Corp (CA), Sybase, MS SQL Server, Britton-Lee, Wang's PACE

System R: IBM San Jose (now Almaden)
- 15 PhDs
- led to IBM's SQL/DS & DB2, Oracle, HP's Allbase, Tandem's Non-Stop SQL
- System R arguably got more stuff "right"

Both were viable starting points and proved the practicality of the relational approach. Beautiful example of theory -> practice!!

Early 80's: commercialization of relational systems

Mid 80's:
- SQL becomes “intergalactic standard”.
- DB2 becomes IBM's flagship product.
- IMS “sunsetted”

90’s: the age of maturity
- network & hierarchical essentially dead (though commonly in use!)
- relational becomes mainstream
- improvements in terms of transactional facilities, performance and stability
- Scale, scale, scale…

Scale, scale, scale…

EOSDIS (NASA’s Earth Observing System Data and Information System): 1 TB/day, keep it all for 15 years (they need tertiary storage for that)

WalMart: 365-node system, 6 TB online, 4-billion-row table, 200 million updates daily, 4000 queries/day, 1500 users/week, 4 min DS response time w/ avg. 60000 rows

Databases make the world go round, mainly due to their ability to handle HUGE amounts of data, RELIABLY!!!

Large scale is our business…
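
A quick back-of-the-envelope check shows why the EOSDIS archive needs tertiary storage and what the WalMart update load means per second. The sketch assumes that "Tb" in the figures above means terabytes, uses 365-day years, and counts 1 PB as 1000 TB. The function names are made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def archive_size_tb(tb_per_day, years, days_per_year=365):
    """Total archive size in terabytes, ignoring leap days."""
    return tb_per_day * days_per_year * years


def per_second(events_per_day):
    """Average rate per second, assuming the load is spread evenly over the day."""
    return events_per_day / SECONDS_PER_DAY


eosdis_tb = archive_size_tb(tb_per_day=1, years=15)  # terabytes kept after 15 years
eosdis_pb = eosdis_tb / 1000                         # same figure in decimal petabytes
walmart_updates_per_s = per_second(200_000_000)      # average sustained update rate
```

Under these assumptions the archive grows to about 5.5 PB, and the update stream averages a little over 2,300 updates per second. Real peaks would be higher, since the load is not spread evenly.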

Late 90’s: object-relational & the web
- SQL-1999 & early implementations
- support for ADTs
- RDBMSs as back-end for internet front-ends
- Application servers and middleware

Today

VLDB 2003

The International Conference on Very Large Data Bases (VLDB) is one of the top database conferences. The 29th VLDB conference was held in Berlin, Germany, in Sept. 2003.

To accommodate the wide spectrum of papers, VLDB 2003 was organized into three tracks:
- Core Database System Technology
- Infrastructure for Information Systems
- Industrial Applications & Experience

http://www.vldb.informatik.hu-berlin.de/

VLDB 2003 – from the CfP

“The Core Database Technology PC will evaluate papers that report on technology that is meant to be incorporated in the database system itself. This includes database engine functions, such as query languages, data models, query processing, views, integrity constraints, triggers, access methods, and transactions in centralized, distributed, replicated, parallel, mobile, and wireless environments.

It also includes extended data types, such as multimedia, spatial and temporal data, and system engineering issues, such as performance, high availability, security, manageability, and ease-of-use. Papers on all aspects of active and object databases, storage technology, and data management system architecture should be submitted to the Core Database Technology PC.”

“The PC covering Infrastructure for Information Systems will evaluate papers that report on methods, issues, and problems faced during the design, development and deployment of innovative solutions for information management.

Examples include workflows, advanced transaction processing features, application servers, object monitors, services in support of E-commerce, mediators and other web-oriented data facilities, metadata repositories, data and process modeling, web services, user interfaces and data visualization, data translation and migration, data cleaning, multi-agent systems, and system management.”

“The PC on Industrial Applications & Experience solicits submissions covering innovative commercial database implementations, novel applications of database technology, and experience in applying recent research advances to practical situations. The track is VLDB's way to foster the exchange of ideas and solutions between research and industry. Application areas include those of Bioinformatics/Life Science, Engineering, Mobile Systems, Enterprise Resource Planning (ERP), and other areas all of which pose technical challenges to the field of data management.”

VLDB 2003

Submissions by track:
- Core: 249
- Infrastructure: 162
- Industrial: 46
- Grand total: 457

Accepted: 84 (70 research, 1:6)
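
The "1:6" figure can be checked from the numbers above. The sketch below assumes that "research" means the Core and Infrastructure tracks taken together. The slide does not say so explicitly, so treat that grouping as a guess.

```python
submissions = {"Core": 249, "Infrastructure": 162, "Industrial": 46}
accepted_total = 84
accepted_research = 70

total_submissions = sum(submissions.values())

# Assumption: the research tracks are Core + Infrastructure.
research_submissions = submissions["Core"] + submissions["Infrastructure"]

overall_rate = accepted_total / total_submissions        # fraction of all papers accepted
research_rate = accepted_research / research_submissions  # fraction of research papers accepted
one_in_n = research_submissions / accepted_research       # "1 in n" form of research_rate
```

With that grouping, about 17% of research submissions were accepted, i.e. close to one in six. The overall rate across all three tracks is slightly higher, a bit over 18%.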

The field is flourishing … getting your paper accepted is hard (nice excuse)!!

- (98) Optimization and Performance
- (84) Advanced Search, Query, and Approximation
- (70) Semi-structured Data, XML
- (64) Internet and WWW Databases / Query Systems
- (63) Access Methods
- (44) Data Mining and Knowledge Discovery
- (32) Infrastructure Challenges and Opportunities
- (30) Databases and database services: Internet and the WWW
- (30) Novel / Advanced Database Applications
- (29) Data Integration / Federation / Mediation
- (29) Information Retrieval with Database Systems
- (29) Middleware Data Architectures
- (29) Special Purpose DB Techn.: Multidimensional Databases
- … miscellaneous other topics …

Tomorrow

The Lowell report -- 2003

Senior database researchers gather every few years to assess the state of database research and to recommend problems and problem areas that deserve additional focus. The previous meetings were held in Laguna Beach, Ca. in 1989, in Palo Alto, Ca. (Lagunitas) in 1990, in Palo Alto, Ca. (Lagunitas II) in 1995, and at Asilomar, Ca. in 1998.

The sixth ad-hoc meeting was held May 4-6, 2003 in Lowell, Mass., USA.

http://research.microsoft.com/~Gray/Lowell/

Issues for future research

- (data)Bases for everything
- Information Fusion
- Multimedia Querying
- Uncertain data & Personalization
- Data Mining
- Privacy & Trustworthy Systems
- New User Interfaces
- 100 year storage

… no more data bases …

…, it is time to stop grafting new constructs onto the traditional architecture of the past. Instead, we should rethink basic DBMS architecture with an eye toward supporting:
- Structured data
- Text, space, time, image, and multimedia data
- Procedural data, that is data types and the methods that encapsulate them
- Triggers
- Data Streams and queues

as co-equal first class components within the DBMS architecture (both its interface and its implementation) rather than as afterthoughts grafted on a relational core.

The participants were adamant that one should start with a clean sheet of paper.

Information Fusion: Therefore, one must perform information integration on-the-fly over perhaps millions of information sources. … the thorny problem of semantic heterogeneity remains …

Multimedia Querying: … to create easy ways to analyze, summarize, search, and view the “electronic shoebox” of a person’s multimedia information.

Uncertain data: … query processing must move from a deterministic model, where there is an exact answer for every query, to a stochastic one, where the query processor performs evidence accumulation to get a better and better answer to a user query.
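
One way to picture "evidence accumulation" is an aggregate whose estimate tightens as more rows are scanned, in the spirit of online aggregation. The sketch below is an illustration chosen for these notes, not a method from the Lowell report. The function name, the batch size, and the normal-approximation error bound are all assumptions.

```python
import math
import random


def progressive_average(values, batch_size, seed=0):
    """Yield (rows_seen, estimate, half_width) after each batch of a random scan.

    half_width is an approximate 95% confidence half-interval (normal
    approximation, no finite-population correction), so it tends to shrink
    as more rows are processed.
    """
    rows = list(values)
    random.Random(seed).shuffle(rows)  # random order makes every prefix a sample
    count, total, total_sq = 0, 0.0, 0.0
    for start in range(0, len(rows), batch_size):
        for value in rows[start:start + batch_size]:
            count += 1
            total += value
            total_sq += value * value
        mean = total / count
        if count > 1:
            variance = max((total_sq - count * mean * mean) / (count - 1), 0.0)
            half_width = 1.96 * math.sqrt(variance / count)
        else:
            half_width = float("inf")
        yield count, mean, half_width


# Hypothetical column of 10,000 values; each step refines the previous answer.
data = [float(i % 100) for i in range(10_000)]
steps = list(progressive_average(data, batch_size=1_000))
```

Each element of `steps` is an answer the user could see early. The last one covers every row, so its estimate equals the exact average.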

Data mining: users … wish for tools that generate some “pearls of wisdom”.

A challenge for data mining research is to develop algorithms and structures for sifting through the databases looking for such pearls, while running in background and consuming excess system resources. Another important challenge is to integrate data mining with database querying, optimization, and other facilities such as triggers.

Privacy: our community can work on security systems that include a component dealing with the prospective use to which the data will be put. Access decisions should be based not only on who is requesting the data but also on what use it will be put to.

New User Interfaces: There is a crying need for better ideas in this area. PV: Major Issue!!!
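
The privacy point, that access should depend on both the requester and the intended use, can be sketched as a purpose-aware check. Everything here is hypothetical: the `AccessRequest` fields, the roles, purposes and column names in `POLICY`, and the `is_allowed` function are invented for illustration and do not come from the report or from any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    requester_role: str  # who is asking
    purpose: str         # what the data will be used for
    column: str          # which data item is requested


# Hypothetical policy: for each column, the (role, purpose) pairs that are allowed.
POLICY = {
    "diagnosis": {("physician", "treatment"), ("researcher", "aggregate-statistics")},
    "address": {("physician", "treatment"), ("billing-clerk", "invoicing")},
}


def is_allowed(request, policy=POLICY):
    """Grant access only if both the requester and the stated purpose match."""
    allowed_pairs = policy.get(request.column, set())  # unknown columns are denied
    return (request.requester_role, request.purpose) in allowed_pairs
```

The same physician is allowed to read `diagnosis` for treatment but not for, say, marketing. A classic role-only check could not express that distinction. Verifying that the stated purpose is the real one is the hard part and is outside this sketch.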

100 year storage: even archived information is disappearing, because it was captured on a medium that is deteriorating (e.g. photographic film or magnetic tape) or because it was captured on a medium that requires obsolete devices (e.g. special storage drives), or because the application that is needed to interpret the information no longer works (e.g. troff). [we need] mechanisms for migration, to copy information from deteriorating or obsolete media, and for emulation, to capture methods that can interpret information that is stored for long periods (e.g. troff renderer)