Importing and Exporting DataShop Data Slides current to DataShop version 4.1.8 Brett Leber...

Post on 17-Dec-2015


Importing and Exporting DataShop Data

http://pslcdatashop.org

Slides current to DataShop version 4.1.8

Brett Leber, Interaction Designer

Is your data right for DataShop?

It might be if it…
• was produced by an intelligent tutoring system
• follows a student action, tutor response sequence (untutored actions OK)
• is primarily textual
• encodes some notion of “steps”

What kind of data do you have?

Benefits of importing your data

DataShop offers:
• Web-based visualization and analysis tools for exploring your data
• Secure storage and backup
• A location on the web where anyone you want can access your data
• Web services for programmatic access

• Directly/Real-time
  – Some tutors are logging directly to the PSLC logging database
  – CTAT-based tutors (when configured correctly) can log to disk or to the logging database over the internet
• Indirectly
  – Other tutors are logging to their own file formats or their own databases
    • These data require a conversion process
    • Many studies are in this category

How do I get data in?

XML vs. tab-delimited format

XML
• Richer description than tab-delimited
  – More fields
  – Problem start time
  – Problem description
  – Problem tutor flag
• More verbose
• Requires some familiarity with XML
• Not especially readable

Tab-delimited
• More concise
• Can edit in Excel
• More easily shareable
• Less rich than XML
  – Missing problem start time, description, and tutor flag

<context_message context_message_id="02CE3AE5-F6D5-9177-913F-C34730F1096C" name="START_PROBLEM">
  <meta>
    <user_id>student01</user_id>
    <session_id>08xz013</session_id>
    <time>2010/02/22 06:43:47.002</time>
    <time_zone>US/Eastern</time_zone>
  </meta>
  <dataset>
    <name>Learn a Language Fall 2007</name>
    <level type="unit">
      <name>Learning Logging</name>
      <problem><name>Translating Tech Talk</name></problem>
    </level>
  </dataset>
</context_message>

Tutor Message Format

<tool_message context_message_id="02CE3AE5-F6D5-9177-913F-C34730F1096C">
  <meta>
    <user_id>student01</user_id>
    <session_id>08xz013</session_id>
    <time>2010/02/22 06:45:48.014</time>
    <time_zone>US/Eastern</time_zone>
  </meta>
  <semantic_event transaction_id="B503948-9164-DD83-EBB2-1589FD38D435" name="ATTEMPT" />
  <event_descriptor>
    <selection>_level0.VideoPlayerInstance1.sliderButtonName</selection>
    <selection type="media_file">mymovie.flv</selection>
    <selection type="clip_length">00:08:00.0</selection>
    <action>cue</action>
    <input type="start_cue">00:04:34.8</input>
    <input type="stop_cue">00:05:42.2</input>
  </event_descriptor>
</tool_message>

Tutor Message Format

<tutor_message context_message_id="02CE3AE5-F6D5-9177-913F-C34730F1096C">
  <meta>
    <user_id>student01</user_id>
    <session_id>08xz013</session_id>
    <time>2010/02/22 06:43:56.367</time>
    <time_zone>US/Eastern</time_zone>
  </meta>
  <semantic_event transaction_id="B503948-9164-DD83-EBB2-1589FD38D435" name="RESULT" />
  <event_descriptor>
    <selection>_level0.VideoPlayerInstance1.sliderButtonName</selection>
    <selection type="media_file">mymovie.flv</selection>
    <selection type="clip_length">00:08:00.0</selection>
    <action>cue</action>
    <input type="start_cue">00:04:34.8</input>
    <input type="stop_cue">00:05:42.2</input>
  </event_descriptor>
  <action_evaluation>INCORRECT</action_evaluation>
  <tutor_advice>Your answer is not correct. Select only the portion of the video where the man is talking about his family.</tutor_advice>
  <skill>
    <name>family_words</name>
    <category>video_portion_selection</category>
  </skill>
</tutor_message>

Tutor Message Format
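A converter that emits messages like the ones above could assemble them with a standard XML library instead of string concatenation, which avoids mismatched tags. The following is a minimal Python sketch, not part of the DataShop libraries: the function name `build_tool_message` and its parameters are hypothetical, and it covers only the elements shown in the example tool message.

```python
import xml.etree.ElementTree as ET


def build_tool_message(context_id, user_id, session_id, time, time_zone,
                       transaction_id, selection, action, input_value):
    """Build a <tool_message> element shaped like the slide's example.

    Hypothetical helper for illustration; it only covers the elements
    that appear in the example message.
    """
    msg = ET.Element("tool_message", {"context_message_id": context_id})

    # <meta> carries who acted, in which session, and when.
    meta = ET.SubElement(msg, "meta")
    for tag, value in (("user_id", user_id), ("session_id", session_id),
                       ("time", time), ("time_zone", time_zone)):
        ET.SubElement(meta, tag).text = value

    # The transaction_id ties this attempt to the tutor's response.
    ET.SubElement(msg, "semantic_event",
                  {"transaction_id": transaction_id, "name": "ATTEMPT"})

    descriptor = ET.SubElement(msg, "event_descriptor")
    ET.SubElement(descriptor, "selection").text = selection
    ET.SubElement(descriptor, "action").text = action
    ET.SubElement(descriptor, "input").text = input_value
    return msg


message = build_tool_message(
    context_id="02CE3AE5-F6D5-9177-913F-C34730F1096C",
    user_id="student01",
    session_id="08xz013",
    time="2010/02/22 06:45:48.014",
    time_zone="US/Eastern",
    transaction_id="B503948-9164-DD83-EBB2-1589FD38D435",
    selection="mymovie.flv",
    action="cue",
    input_value="00:04:34.8",
)
xml_text = ET.tostring(message, encoding="unicode")
print(xml_text)
```

Because the library closes every element it opens, the output is always well-formed; whether it is valid Tutor Message Format still has to be checked against the format guide.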

Same thing in tab-delimited

Anon Student Id   Session ID   Time                      Time Zone   Student Response Type   Tutor Response Type
student01         08xz013      2010/02/22 06:45:48.014   EST         ATTEMPT                 RESULT

Level(Unit)        Problem Name            Step Name         Outcome     Selection     Action
Learning Logging   Translating Tech Talk   mymovie.flv cue   INCORRECT   mymovie.flv   cue

And so on
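For illustration, the example row could be written as tab-delimited text with Python's standard `csv` module. This is a sketch only: the column list contains just the columns named on this slide, not the full set an import file may need, so check the import format documentation before relying on it.

```python
import csv
import io

# Columns as they appear on the slide; a real import file has more.
COLUMNS = [
    "Anon Student Id", "Session ID", "Time", "Time Zone",
    "Student Response Type", "Tutor Response Type",
    "Level(Unit)", "Problem Name", "Step Name",
    "Outcome", "Selection", "Action",
]

row = {
    "Anon Student Id": "student01",
    "Session ID": "08xz013",
    "Time": "2010/02/22 06:45:48.014",
    "Time Zone": "EST",
    "Student Response Type": "ATTEMPT",
    "Tutor Response Type": "RESULT",
    "Level(Unit)": "Learning Logging",
    "Problem Name": "Translating Tech Talk",
    "Step Name": "mymovie.flv cue",
    "Outcome": "INCORRECT",
    "Selection": "mymovie.flv",
    "Action": "cue",
}

# Write to an in-memory buffer; pass an open file instead to write to disk.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=COLUMNS,
                        delimiter="\t", lineterminator="\n")
writer.writeheader()
writer.writerow(row)
tab_text = buffer.getvalue()
print(tab_text)
```

One transaction becomes one line, with the student's attempt and the tutor's response sharing the row.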

Tools: XML vs. tab-delimited format

XML
• Java Logging Library
  – Log in XML to disk or to a logging server
  – http://pslcdatashop.org/about/libraries.html
• Flash Logging Library
  – Log to a logging server
  – http://ctat.pact.cs.cmu.edu/index.php?id=logging-flash
• Build a tutor with CTAT without programming
  – Can log to disk or to a logging server
  – http://ctat.pact.cs.cmu.edu
• Convert to XML via your own program
  – Transform existing log data into valid Tutor Message Format
  – Validate your XML with a tool we’ve created
  – http://pslcdatashop.web.cmu.edu/xmlvalidator.html

Tab-delimited
• DataShop Import Tool
  – Verify your import file with our Verification Tool
  – http://pslcdatashop.web.cmu.edu/importverify.html
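If you convert to XML with your own program, a quick local well-formedness check can catch mismatched tags before you use the validator. The Python sketch below is an assumption-laden convenience, not a DataShop tool: `is_well_formed` is a hypothetical helper, and it only checks that the XML parses. It does not validate against the Tutor Message Format, so the XML validator listed above is still needed.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, "") if xml_text parses, else (False, parser error).

    Checks well-formedness only; it does not validate element names
    or structure against the Tutor Message Format.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        return False, str(error)
    return True, ""


good = ("<event_descriptor><selection>mymovie.flv</selection>"
        "<action>cue</action></event_descriptor>")
# An opening <selection> closed by </input> is a typical conversion slip.
bad = "<event_descriptor><selection>mymovie.flv</input></event_descriptor>"

good_ok, good_error = is_well_formed(good)
bad_ok, bad_error = is_well_formed(bad)
print(good_ok, bad_ok, bad_error)
```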

Documentation

For XML:
• Guide to the Tutor Message Format: http://pslcdatashop.org/dtd/guide/

For tab-delimited format:
• http://pslcdatashop.org/about/importverify.html

To learn about terminology:
• http://pslcdatashop.org/help?page=terms

To learn about existing DataShop output formats:
• http://pslcdatashop.org/help?page=export

Case Study: Chinese Writing Study Fall 2009

http://www.learnlab.org/research/wiki/index.php/Perfetti_-_Read_Write_Integration

• Researchers presented the DataShop team with their data, which was in a tabular format unlike the DataShop format.

• DataShop team consulted with the research team to see which DataShop-required fields were missing and which new fields were extra.

• DataShop team and researchers arrived at definitions of problems, steps, and knowledge components.

• DataShop requires a correct/incorrect tagging of each attempt, so correctness was determined by a threshold (e.g., 0.5).

• DataShop consultant (Alida) wrote a converter from this tabular format to XML and imported the result into DataShop.
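The thresholding step in this case study can be sketched in a few lines of Python. Everything specific here is hypothetical: the field names (`student`, `item`, `score`), the sample values, the `CORRECT` label, and the choice to count a value exactly at the threshold as correct are assumptions for illustration, not details of the actual converter.

```python
def outcome_from_score(score, threshold=0.5):
    """Map a numeric score to a correct/incorrect tag.

    Assumption: a score equal to the threshold counts as correct.
    """
    return "CORRECT" if score >= threshold else "INCORRECT"


# Hypothetical rows standing in for the researchers' tabular data.
attempts = [
    {"student": "s01", "item": "char_01", "score": 0.82},
    {"student": "s01", "item": "char_02", "score": 0.31},
    {"student": "s02", "item": "char_01", "score": 0.50},
]

# Add an "outcome" field to each attempt without mutating the input rows.
tagged = [dict(attempt, outcome=outcome_from_score(attempt["score"]))
          for attempt in attempts]

for attempt in tagged:
    print(attempt["student"], attempt["item"], attempt["outcome"])
```

Keeping the threshold as a parameter makes it easy to rerun the conversion if the researchers later settle on a different cutoff.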

Future of importing and the format

• Push-button import
• Richer, more-flexible format
• Multimedia (audio)
• Dialogue data

Exporting from DataShop

• From the website:
  – By transaction
  – By student-step
  – By student-problem
• From web services:
  – By transaction
  – By student-step

Exporting from DataShop

1. Log in to the web application.

2. Choose a dataset.

3. Click “Export” tab.

4. Choose a level of granularity (transaction, step, or problem).

5. Choose a sample.

6. Click the export button.

Tip: The “All Data” sample is cached for transaction export, so choosing that sample results in the fastest export.
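Once a transaction export is downloaded, it can be read with a standard tab-delimited parser. The Python sketch below assumes the export is a tab-delimited text file with a header row containing `Anon Student Id` and `Outcome` columns (the names used in the import example); check the export format documentation for the actual column set. The function name `count_outcomes` and the inline sample data are hypothetical.

```python
import csv
import io
from collections import Counter


def count_outcomes(export_file):
    """Count outcomes per student in a tab-delimited transaction export.

    Assumes a header row with "Anon Student Id" and "Outcome" columns.
    """
    reader = csv.DictReader(export_file, delimiter="\t")
    counts = {}
    for row in reader:
        student = row["Anon Student Id"]
        counts.setdefault(student, Counter())[row["Outcome"]] += 1
    return counts


# Hypothetical stand-in for an exported file; use open(path, newline="")
# with the downloaded export instead.
sample = io.StringIO(
    "Anon Student Id\tOutcome\n"
    "student01\tINCORRECT\n"
    "student01\tCORRECT\n"
    "student02\tCORRECT\n"
)
counts = count_outcomes(sample)
for student, outcomes in counts.items():
    print(student, dict(outcomes))
```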

Questions?