Organizing Information Digitally Norm Friesen. Overview General properties of digital information...

Organizing Information Digitally

Norm Friesen

Overview

• General properties of digital information

• Relational: tabular & linked

• Object-Oriented: inheritance & modularity

• Markup: serial & hierarchical

General Properties

• Multiple axes and access points

– Allow for different views

• Form & Content can (should) be separate

• Formatting can be used for analysis & organization of data

• Instructions and data can be combined

– Effects of instructions are difficult to control

• Database software for each type

Examples

• Relational: library catalogue, Amazon.com, hotel reservation system

• Markup: Web pages & Google, Blogs & RSS

• Object-Oriented: programs of all kinds; Windows XP, Office, etc.; the Java programming language

Relational

• Tables and links

• Table: “a systematic arrangement of data usually in rows and columns for ready reference”

• Each table represents an entity: a category of thing, rather than a specific instance of that category. Entities can be thought of (roughly) as nouns.

Deriving tables from text

• Tabs, commas and hard returns (paragraphs) are often used to indicate rows and columns in a table

• Data in this format often called “flat files.”

• Can be used as a way of getting data “into” a database: make a list into a database table
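
As a rough illustration of turning a flat file into a database table, the sketch below reads comma-separated text and loads it into an in-memory SQLite table. The sample data, the table name `book` and its column names are assumptions made for the example; they are not taken from the slides.

```python
import csv
import io
import sqlite3

# Hypothetical flat file: commas separate columns, hard returns separate rows.
flat_file = (
    "title,author,year\n"
    "A Tale of Two Cities,Charles Dickens,1859\n"
    "Emma,Jane Austen,1815\n"
)

# The first line supplies the column names; every later line becomes one row.
rows = list(csv.DictReader(io.StringIO(flat_file)))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (title TEXT, author TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO book (title, author, year) VALUES (:title, :author, :year)",
    rows,
)

# Once the list is a table, it can be sorted and filtered with a query.
loaded = conn.execute("SELECT title, year FROM book ORDER BY year").fetchall()
print(loaded)
```

A tab-separated file could be read the same way by passing `delimiter="\t"` to `csv.DictReader`.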

Relational, con’t

• An entity described in a table can be related to other entities

– E.g. person and membership card(s)

• This relationship can be:

– One to one

– One to many

– Many to many

Primary Key

• Primary Key: a field that uniquely identifies each record stored in a table. This field is often automatically numbered; it cannot contain any empty, blank or null values.
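
A minimal sketch of a primary key, assuming SQLite and a hypothetical `member` table: the key is numbered automatically when a record is inserted, and a second record with the same key value is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# member_id is the primary key; SQLite numbers it automatically when omitted.
conn.execute(
    "CREATE TABLE member (member_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)

first = conn.execute("INSERT INTO member (name) VALUES ('Ana')").lastrowid
second = conn.execute("INSERT INTO member (name) VALUES ('Ben')").lastrowid

# Reusing an existing key value violates uniqueness and raises an error.
try:
    conn.execute("INSERT INTO member (member_id, name) VALUES (?, 'Cy')", (first,))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(first, second, duplicate_rejected)
```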

Definition: Relation

Relation: A connection between two tables, each describing an entity that interacts with the other. In the example above, users (described in the first table) compose and send messages (described in the second table). The primary key value for one of these entities is stored in two places: in its own table, and as a foreign key in the related table.
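
The users-and-messages relation described above could be sketched as follows. This is an illustration only, assuming SQLite; the table names, column names and sample rows are hypothetical. The user's primary key (`user_id`) reappears in the `messages` table as the foreign key `sender_id`, which gives a one-to-many relation: one user, many messages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked
conn.executescript("""
CREATE TABLE users (
    user_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE messages (
    message_id INTEGER PRIMARY KEY,
    sender_id  INTEGER NOT NULL REFERENCES users(user_id),  -- foreign key
    body       TEXT NOT NULL
);
INSERT INTO users (user_id, name) VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO messages (sender_id, body)
    VALUES (1, 'Hello'), (1, 'Lunch?'), (2, 'Hi Ana');
""")

# Follow the link from each message back to the user who sent it.
sent = conn.execute("""
    SELECT u.name, m.body
    FROM messages AS m
    JOIN users AS u ON u.user_id = m.sender_id
    ORDER BY m.message_id
""").fetchall()

# A message that points at a user who does not exist is rejected.
try:
    conn.execute("INSERT INTO messages (sender_id, body) VALUES (99, 'Lost')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(sent, orphan_rejected)
```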

Many to Many: Junction Table
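
One way to picture a junction table, sketched with SQLite under assumed names (`student`, `course`, `enrolment` are hypothetical and not from the slides): a student can take many courses and a course can have many students, so a third table holds one row per pairing, made up of the two foreign keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT);

-- Junction table: each row links one student to one course.
CREATE TABLE enrolment (
    student_id INTEGER REFERENCES student(student_id),
    course_id  INTEGER REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);

INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO course  VALUES (10, 'Databases'), (20, 'Markup');
INSERT INTO enrolment VALUES (1, 10), (1, 20), (2, 10);
""")

# One student, many courses.
courses_for_ana = [row[0] for row in conn.execute("""
    SELECT c.title FROM enrolment AS e
    JOIN course AS c ON c.course_id = e.course_id
    WHERE e.student_id = 1 ORDER BY c.title
""")]

# One course, many students.
students_in_databases = [row[0] for row in conn.execute("""
    SELECT s.name FROM enrolment AS e
    JOIN student AS s ON s.student_id = e.student_id
    WHERE e.course_id = 10 ORDER BY s.name
""")]

print(courses_for_ana, students_in_databases)
```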

Activity: a 2-Table database

• Think of examples

• Look at examples for the database application project

• Include primary and foreign key

• Make sure that you use the correct relation type

Relational Data: Other Characteristics

• Particular means of querying: SQL, or Structured Query Language

– ISO/IEC 9075: Information Technology - Database Languages - SQL

• Not good at representing complex relationships and some kinds of entities/data

– Complexity can sometimes be accommodated at the price of performance

– Multimedia not easy to accommodate
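
For a sense of what querying with SQL looks like, here is a small sketch run through Python's built-in `sqlite3` module. The `reservation` table and its rows are invented for the example (loosely echoing the hotel reservation system mentioned earlier).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reservation (room TEXT, guest TEXT, nights INTEGER);
INSERT INTO reservation VALUES
    ('101', 'Ana', 2),
    ('101', 'Ben', 1),
    ('202', 'Cy',  4);
""")

# The query states what is wanted (total nights per room), not how to compute it.
nights_per_room = conn.execute("""
    SELECT room, SUM(nights)
    FROM reservation
    GROUP BY room
    ORDER BY room
""").fetchall()

print(nights_per_room)
```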

Object-Orientation

• Way of organizing and conceptualizing information largely for the purposes of programming

• Programming: the creation of a step-by-step list of instructions written for a particular computer environment in a particular language.

Object Orientation: Characteristics

• Modular: Black boxes with a standardized interface; encapsulation

• Classes and inheritance: part of producing and modifying program components

• Operation: what the object can do

Object Orientation: Modular

• Bugs tend to arise from unexpected consequences of relations between parts of a program

– Simplify relations by defining modular program components that relate to one another through clearly defined interfaces.

– Programmers and program components only deal with the interface, not the module or object contents.
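
The idea of dealing only with the interface can be sketched in Python. The class and function names below are hypothetical. Two "black boxes" store their data differently but expose the same interface (`increment()` and `value`), so the calling code works with either and never touches their contents.

```python
class Counter:
    """Keeps a running total in a single hidden number."""

    def __init__(self):
        self._count = 0  # internal detail; callers should not rely on it

    def increment(self):
        self._count += 1

    @property
    def value(self):
        return self._count


class ListCounter:
    """Same interface as Counter, but a different hidden implementation."""

    def __init__(self):
        self._events = []  # internal detail; callers should not rely on it

    def increment(self):
        self._events.append(1)

    @property
    def value(self):
        return len(self._events)


def count_words(text, counter):
    """Uses only the interface: increment() and value."""
    for _ in text.split():
        counter.increment()
    return counter.value


print(count_words("tables objects markup", Counter()))
print(count_words("tables objects markup", ListCounter()))
```

Because `count_words` never reads `_count` or `_events`, either component can be swapped for the other without changing the caller.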

Object Orientation: Classes

• A class is a pattern, template, or blueprint for a category of structurally identical items. The items created using the class are called instances. This is often referred to as the "class as a `cookie cutter'" view. As you might guess, the instances are the "cookies.”

(http://www.toa.com/pub/oobasics/oobasics.htm)
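
The "cookie cutter" view can be sketched in a few lines of Python. The `Cookie` class and its `flavour` attribute are made up for illustration: the class is the cutter, and each call to it produces a separate instance with the same structure but its own data.

```python
class Cookie:
    """The class: a template describing what every cookie looks like."""

    def __init__(self, flavour):
        self.flavour = flavour

    def describe(self):
        return f"{self.flavour} cookie"


# The instances: structurally identical, but separate objects with their own data.
first = Cookie("ginger")
second = Cookie("oat")

print(first.describe(), "/", second.describe())
```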

Object Orientation: Inheritance

• “In an object-oriented context, we speak of specializations as "inheriting" characteristics from their corresponding generalizations. Inheritance can be defined as the process whereby one object acquires (gets, receives) characteristics from one or more other objects.”
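
A minimal sketch of inheritance between classes, using hypothetical names: `WebPage` is a specialization of the more general `Document`. It inherits `summary()` unchanged, adds a `url`, and overrides `kind()`.

```python
class Document:
    """The generalization."""

    def __init__(self, title):
        self.title = title

    def kind(self):
        return "Document"

    def summary(self):
        return f"{self.kind()}: {self.title}"


class WebPage(Document):
    """The specialization: inherits title and summary() from Document."""

    def __init__(self, title, url):
        super().__init__(title)  # reuse the general set-up
        self.url = url           # add something specific to web pages

    def kind(self):
        return "Web page"        # override one inherited characteristic


page = WebPage("Stephen's Web", "http://www.downes.ca")
print(page.summary())
```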

Object-oriented Databases

• Data is stored as objects; it can be interpreted only using the methods specified for it, usually by its class. The relationship between similar objects is preserved (inheritance), as are references between objects.

Object-oriented Databases, con’t

• Object data doesn’t translate well into SQL data: the Object-SQL (object-relational) impedance mismatch

• “As an industry, ODBMS were long considered to be a lost opportunity to revolutionize software development. Since 2004, object databases have seen a renaissance when open source object databases appeared…”

Markup Languages

• Markup refers to the use of a markup language to describe the structure and appearance of a particular document.

– HTML: describes the appearance of documents

– XML: geared to the description of the structure of documents

– There are many types of documents, so many derivatives from XML exist

Markup, con’t

• Used for both documents and records

• Both XML and HTML derived from SGML, “Standard Generalized Markup Language” (an ISO standard since 1986, with roots in the 1960’s). A language for formulating languages

– XML (1996): a simplified subset of SGML

– HTML (1992): very simplified subset; XHTML conforms to XML

Markup, con’t

<title>A Tale of Two Cities</title>

SERIAL & HIERARCHICAL:

<image>
  <title>Stephen's Web</title>
  <url>http://www.d.ca/r.gif</url>
  <link>http://www.downes.ca</link>
  <width>90</width>
  <height>36</height>
</image>

(validation)
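
To see the serial and hierarchical sides of the `<image>` example together, the sketch below reads it with Python's standard `xml.etree.ElementTree` module. The text arrives as one serial string; parsing recovers the hierarchy of a parent element with five children. Note that this parser only checks that the markup is well-formed (tags properly opened, closed and nested). It does not validate against a DTD or schema, which would need a separate tool.

```python
import xml.etree.ElementTree as ET

# Serial form: one string of characters, read from start to end.
xml_text = """<image>
  <title>Stephen's Web</title>
  <url>http://www.d.ca/r.gif</url>
  <link>http://www.downes.ca</link>
  <width>90</width>
  <height>36</height>
</image>"""

# Hierarchical form: <image> is the parent, the other elements are its children.
image = ET.fromstring(xml_text)
child_tags = [child.tag for child in image]
title = image.findtext("title")
width = int(image.findtext("width"))

# Markup with a missing closing tag is not well-formed and is rejected.
try:
    ET.fromstring("<image><title>Stephen's Web</image>")
    well_formed = True
except ET.ParseError:
    well_formed = False

print(image.tag, child_tags, title, width, well_formed)
```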

XML

• OpenDocument: for office documents

• DocBook: for manuals

• XrML: for enforceable copyright statements

• RSS: for news/posting syndication

• MathML: for formatting mathematical formulations

• RuleML: expressing formal rules for processing information, etc.

DTD/Schema, Document, XSLT

XML, con’t

• Repetition of elements within repetitions.

• XML databases

– Relational/hybrid

– “Native”

– XQuery
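
Repetition within repetition can be illustrated with a small, made-up document: a library repeats `<book>` elements, and each book repeats `<chapter>` elements. Python's standard library has no XQuery support, so this sketch stands in for it with the limited path expressions that `xml.etree.ElementTree` does provide. The element names and content are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET

xml_text = """<library>
  <book>
    <title>A Tale of Two Cities</title>
    <chapter>The Period</chapter>
    <chapter>The Mail</chapter>
  </book>
  <book>
    <title>Emma</title>
    <chapter>Chapter I</chapter>
  </book>
</library>"""

library = ET.fromstring(xml_text)

# Outer repetition: books. Inner repetition: chapters within each book.
chapters_per_book = {
    book.findtext("title"): [chapter.text for chapter in book.findall("chapter")]
    for book in library.findall("book")
}

# A path expression can reach through both levels at once.
all_chapters = [chapter.text for chapter in library.findall("./book/chapter")]

print(chapters_per_book)
print(all_chapters)
```

A single relational table has no direct equivalent of this nesting; the chapters would normally go into a second table linked back to the books by a foreign key.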

Summary

• Three forms of organizing information

• Each is flexible and powerful, but only within specific domains/purposes

• Most widespread database technologies are relational

• But the other two forms (markup and object-oriented) do not translate easily into this format.