BY : LISSY VERMA SHRADDHA GUPTA. Data Collection ODK : Open Data Kit Demo Usher : Improving Data...
DATA COLLECTION AND
IMPROVING DATA QUALITY
BY : LISSY VERMA, SHRADDHA GUPTA
OUTLINE
Data Collection
  ODK : Open Data Kit
  Demo
Usher : Improving Data Quality
  Purpose
  Implementation
  Results
DATA COLLECTION
Data collection in developing areas is difficult.
None of the existing tools suffices.
New features are needed to meet the needs of these areas.
OPEN DATA KIT
ODK is a tool suite for collection and management of data on mobile phones.
The main objective is to provide open source tools.
OPEN DATA KIT
ODK COLLECT : Collects data.
ODK AGGREGATE : Stores, views and exports data.
ODK MANAGE : Remote device management.
A QUICK DEMO
AMPATH
AMPATH deployed ODK to collect data for medical purposes.
The deployment was found to be successful, minimizing delays and improving the lives of healthcare workers and other people.
Data Collection is Challenging
Expertise in form design
Double Entry : Costly
Data Cleaning
Past Work
Constraints
Combo-boxes.
Reduce Time
Automatically filled Leave-forms.
USHER: Improving Data Quality
ESCORTER : Guides towards correct entries.
Question Ordering in form.
Greedy Information Gain
Dynamically Reorder Questions
Predict Errors to Re-ask.
Contextualized Error Likelihood Principle.
CURBSTONING
Concept : An unscrupulous door-to-door surveyor shirks work and asks only the important questions.
Greedy Information Gain
Uniform Prior : Equally likely inputs
Training Set
Context-specific Model Required
Bayesian Learning
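A minimal sketch of greedy question ordering, assuming the "information gain" of a question is its conditional entropy given the questions already placed, estimated from raw counts in a training set. This is a simplified stand-in: the slides mention a Bayesian model, which is not reproduced here, and the field names and records are hypothetical.

```python
from collections import Counter
from math import log2

def joint_entropy(records, fields):
    """Empirical joint entropy (in bits) of the given fields over the records."""
    if not fields:
        return 0.0
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    n = len(records)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def greedy_order(records, fields):
    """Place questions one at a time, always taking the one with the highest
    conditional entropy: H(q | chosen) = H(chosen, q) - H(chosen)."""
    chosen, remaining = [], list(fields)
    while remaining:
        base = joint_entropy(records, chosen)
        best = max(remaining,
                   key=lambda q: joint_entropy(records, chosen + [q]) - base)
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Hypothetical toy training set (not taken from the datasets in the slides).
records = [
    {"sex": "F", "pregnant": "yes", "district": "A"},
    {"sex": "F", "pregnant": "no",  "district": "B"},
    {"sex": "F", "pregnant": "no",  "district": "C"},
    {"sex": "M", "pregnant": "no",  "district": "A"},
    {"sex": "M", "pregnant": "no",  "district": "B"},
    {"sex": "M", "pregnant": "no",  "district": "C"},
]
order = greedy_order(records, ["pregnant", "sex", "district"])
print(order)
```

With a uniform prior and no training data, every question with the same number of choices would score equally, which is why a training set is needed to make the ordering meaningful.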
DATASETS
Patient dataset : collected at a rural HIV/AIDS clinic in Tanzania.
Survey dataset : responses from a 1986 poll about race and politics.
Probabilistic Relation : Form Questions
Bayesian Network for the patient dataset
Question layout generated by the algorithm
Re-ask Questions
Approximates Double Entry
Uncertainty : High Entropy
Outliers
Data-entry Feedback
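A minimal sketch of choosing which answers to re-ask, assuming an entered value is suspicious when its estimated probability given one context field is low. The names, the threshold and the single-field context are assumptions made for illustration; this is not Usher's actual error model.

```python
from collections import Counter, defaultdict

def fit_conditional(records, target, given):
    """Count how often each target value occurs for each value of the context field."""
    counts = defaultdict(Counter)
    for r in records:
        counts[r[given]][r[target]] += 1
    return counts

def likelihood(counts, value, context, n_values, alpha=1.0):
    """Laplace-smoothed estimate of P(target = value | given = context)."""
    seen = counts[context]
    return (seen[value] + alpha) / (sum(seen.values()) + alpha * n_values)

def should_reask(counts, value, context, n_values, threshold=0.25):
    """Flag an entry for re-asking when its estimated likelihood is below the threshold."""
    return likelihood(counts, value, context, n_values) < threshold

# Hypothetical training data: previously entered forms.
training = (
    [{"sex": "M", "pregnant": "no"}] * 3
    + [{"sex": "F", "pregnant": "yes"}]
    + [{"sex": "F", "pregnant": "no"}] * 2
)
model = fit_conditional(training, target="pregnant", given="sex")

flag_unlikely = should_reask(model, "yes", "M", n_values=2)  # likelihood 1/5
flag_likely = should_reask(model, "no", "M", n_values=2)     # likelihood 4/5
print(flag_unlikely, flag_likely)
```

Re-asking only the flagged answers approximates double entry at a fraction of its cost, since most fields are entered once.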
Usher Components And Data-flow
Error Modeling
Accurate Prediction Results
THANK YOU
SUPPLEMENTARY SLIDES
DATA COLLECTION : PROBLEMS
Due to the digital divide between developing and developed areas, it is very difficult to collect and use data in developing regions.
The main problems are : lack of reliable infrastructure, lack of proper connectivity, and inadequate expertise.
Currently available data collection tools such as Pendragon Forms, Nokia Data Gathering, JavaRosa and RapidSMS are difficult to deploy, hard to use, complicated to scale and rarely customizable.
OPEN DATA KIT
The Open Data Kit or simply ODK is a suite of tools for data collection that uses Google’s Android platform.
The main objectives of the technology are :
Modularising and customising tools
Use of open interfaces and standards
Long-term survival of tools
The components of ODK include :
1. ODK Collect : collects data using Forms.
2. ODK Aggregate : ready-to-deploy online repository to store, view and export collected data.
3. ODK Build : enables users to generate Forms.
4. ODK Voice : maps Forms to sound snippets.
5. ODK Clinic : mobile medical record system.
6. ODK Manage : maintains a database of all phones for remote device management.
7. ODK Validate : validates Forms.
Other tools include ODK Dropbox, ODK Rangefinder, ODK Tasks, ODK Listen and ODK Visualise.