A Journey in Data Discovery Wendy Watkins TSES 3001 30 October, 2007.


Page 2:

<odesi> (Ontario Data Documentation, Extraction Service and Infrastructure)

• A project to provide data on a browser-based platform

• Uses the DDI (Data Documentation Initiative) international standard for metadata

• Written in XML so it can be read across multiple platforms

– Text-based representation that can be preserved (ASCII)

• Allows searching across datasets and servers

• Provides an easy-to-use interface for beginning researchers
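To make the XML and cross-dataset search points concrete, here is a minimal Python sketch. It is an illustration only, not the <odesi> or Nesstar implementation. The element names (codeBook, stdyDscr, titl, dataDscr, var, labl) follow DDI Codebook conventions, but the two fragments are heavily simplified (no namespace, most elements omitted), and the study titles, variable names, and the search_variables helper are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Two tiny, made-up codebook fragments (assumption: simplified DDI-style
# markup with no namespace; real DDI files carry far more metadata).
CODEBOOKS = [
    """<codeBook>
         <stdyDscr><citation><titlStmt>
           <titl>Example Opinion Poll A</titl>
         </titlStmt></citation></stdyDscr>
         <dataDscr>
           <var name="AGE"><labl>Age of respondent</labl></var>
           <var name="VOTE"><labl>Vote intention</labl></var>
         </dataDscr>
       </codeBook>""",
    """<codeBook>
         <stdyDscr><citation><titlStmt>
           <titl>Example Election Survey B</titl>
         </titlStmt></citation></stdyDscr>
         <dataDscr>
           <var name="PROV"><labl>Province of residence</labl></var>
           <var name="VOTE06"><labl>Party voted for in last election</labl></var>
         </dataDscr>
       </codeBook>""",
]


def search_variables(xml_docs, keyword):
    """Hypothetical helper: return (study title, variable name, label)
    for every variable whose label contains the keyword."""
    keyword = keyword.lower()
    hits = []
    for doc in xml_docs:
        root = ET.fromstring(doc)
        title = root.findtext("stdyDscr/citation/titlStmt/titl",
                              default="(untitled)")
        for var in root.iter("var"):
            label = var.findtext("labl", default="")
            if keyword in label.lower():
                hits.append((title, var.get("name"), label))
    return hits


# One keyword search runs across both datasets' metadata.
for hit in search_variables(CODEBOOKS, "vote"):
    print(hit)
```

Because the metadata is plain text in a shared structure, any XML-aware tool can read it, which is the property the slide relies on for both portability and preservation.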

Page 3:

<odesi>

• A $1.04 million project funded by OntarioBuys (province) and OCUL (University Libraries)

• Labour-intensive project requiring the input of extensive metadata to provide better access and preservation

• A collaborative effort between Carleton and Guelph

• Work is being done by co-op students from both universities

• Will include:

– 55 years of Gallup Canada datasets (1945-2000)

– About 250 Statistics Canada surveys

– Canadian election surveys from 1965-2006

– Other polling data

– Data from the Inter-university Consortium for Political and Social Research at Ann Arbor

• Data will be mounted on Scholars’ Portal

• XML files will be shared across the country

Page 4:

<odesi>

• Will build on previous work

• CRIC files are already in this format

• Exposes undergraduates to the research enterprise at an early stage in their careers

• Is important in developing numeracy skills

• Focus is on understanding what the data show, not on the formulae

• More will be available at Carleton by the next term

Page 5:

<odesi>

• Uses the Nesstar software

• Metadata is first put into the publisher

• Can be as rich as you want to make it

• Can be exported as an XML file

– If something better comes along, the new software will read the file

• Collaboration through sharing the work


Page 13:

<odesi>

• Will allow data to be shared by all with access to the web

• Because of its preservable, text-based format, data evaporation will be avoided
