Exploring problems of data mobility, sharing and reuse
Rob Procter, Mark Hartswood, Stuart Anderson, Paul Taylor, Lilian Blot
Overview
• The eResearch vision.
• Background to this study.
• Earlier studies of data mobility, sharing and re-use.
• Fieldwork findings and implications.
• Conclusions.
The eResearch vision
• The eResearch vision promotes collaboration, interdisciplinary work and ‘reduced time to discovery’ as the keys to future scientific advances.
• Increased data sharing and re-use is seen as fundamental to the realisation of this vision.
Background to this study
• eDiaMoND was a UK e-Science programme project to create a shared national archive of digital mammograms from the UK breast screening programme, and use it to support a range of activities, including training.
• A follow-on project (LEMI) developed a training tool in collaboration with clinicians.
• Its aim was to draw upon archive materials and use them in ‘live’ training situations.
The UK National Breast Screening Programme
• Breast cancer is the most common cancer in the UK.
• Screening by mammography (breast X-rays) is offered every three years to women between 50 and 70 years of age.
• Mammograms are examined by trained readers for signs of abnormality.
• Abnormal cases are recalled for further tests at an assessment clinic.
– 3-6% are recalled and about 0.3-0.6% are malignant.
eDiaMoND
Digital mammogram archive
Research
• Epidemiology
• Image analysis
Practice
• Training
• Remote reading
Diagram labels: LEMI, Training, Screening tool, Lesion Zoo
Source: eDiaMoND blueprint document, 2005, http://www.ediamond.ox.ac.uk/publications/blueprint-Final.pdf
eDiaMoND data sharing and re-use model
Diagram labels: Originating context, Data archive, Use context, Metadata
Earlier studies of eDiaMoND
• Jirotka, M. et al. (2005) Collaboration and Trust in Healthcare Innovation: The eDiaMoND Case Study. JCSCW.
– Problematised the idea of remote reading.
– Understanding the circumstances of mammogram production and use is important for trust in the data.
• Coopmans, C. (2006) Making Mammograms Mobile: Suggestions for a Sociology of Data Mobility. Information, Communication and Society.
– Problematised the idea of data mobility.
– “An understanding of mobility … does not only emphasize that transit is an active achievement but also draws attention to the craft like nature of that achievement: the artful connecting of time, space, material and immaterial elements into a ‘mobility effect.’”
Questions motivating this study
• How should we understand the relationship between data and its originating context?
• What happens when people actually engage with the data to do something purposeful?
How should we understand the relationship between data and context?
• Berg and Goorman (1999) describe medical data as ‘entangled’ with the context of its production.
• Words like ‘disentangled’ seem to imply that data can somehow be liberated from its context.
• Berg and Goorman argue that the more contexts data has to be usable in, the more work is needed to disentangle it.
Patient records and data structures
• Patient records:
– Rich.
– Heterogeneous.
– Redundant.
– Documenting and guiding practice.
– Implicit relations.
• Data structures:
– Partial.
– Selected.
– Explicit relations.
Encounters with eDiaMoND data
• Problems emerging when encountering the data in relation to:
– Application development.
– Set selection.
– Training.
• We will examine:
– How problems were recognised, diagnosed and fixed.
– Who was involved and what resources they needed.
Example 1: Data correction work
• Couldn’t be done automatically:
– Data not of sufficient quality.
• But enough data was embedded in the digital artefacts that a skilled person could correct it.
Example 2: Selecting cases to include in training sets
Uncovering omissions
Example 3: Training
Mentoring the trainee
Findings: 1
• Use of the data led to different sorts of data ‘problem’ emerging, requiring different sorts of resources to diagnose and repair.
• We had to go back to source and make corrections, additions, sometimes change the data model.
• Making sense of data depends on some understanding of the context of production.
• It was difficult to predict a priori what contextual information to preserve and what to discard.
Findings: 2
• Studies of data mobility focus on the need for work to ‘disentangle’ or ‘decontextualise’ data, but making interpretation and use of data less dependent on the originating context is only one contributor to mobility.
• While we carve out a ‘chunk of context’, we also throw away significant detail, and no longer have easy access to the full range of resources that we would usually depend upon for making sense of its contents.
Implications
• Moving on from the eDiaMoND data curation model:
– Tacit assumption that data abstracted from a working context can be treated as self-sufficient.
• Better access to originating contexts:
– Interpretative practices attendant on data re-use involve linking originating and use context by some other means than that provided by metadata.
• Ease of correcting and amending data in situ:
– Facilities need to be available at point of use, and not separated out into different processes and activities.
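One way to picture these implications is a record that keeps a pointer back to its originating context and lets the person using it log corrections on the spot. The sketch below is purely illustrative and is not the eDiaMoND or LEMI design: the class names, the `origin_ref` field and the example values are all assumptions made for this illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass
class Amendment:
    """One correction made at the point of use (hypothetical structure)."""
    field_name: str
    old_value: Any
    new_value: Any
    made_by: str
    reason: str
    made_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass
class ArchiveRecord:
    """A hypothetical archive entry that stays linked to its source."""
    record_id: str
    values: Dict[str, Any]
    origin_ref: str  # assumed pointer to the originating context, not metadata alone
    amendments: List[Amendment] = field(default_factory=list)

    def amend(self, field_name: str, new_value: Any, made_by: str, reason: str) -> None:
        """Correct a value in place and keep the previous value in the log."""
        old_value = self.values.get(field_name)
        self.amendments.append(
            Amendment(field_name, old_value, new_value, made_by, reason)
        )
        self.values[field_name] = new_value

    def history(self, field_name: str) -> List[Amendment]:
        """Return every logged amendment for one field, oldest first."""
        return [a for a in self.amendments if a.field_name == field_name]


# Illustrative usage with made-up values.
record = ArchiveRecord(
    record_id="case-001",
    values={"laterality": "left"},
    origin_ref="screening-centre-A/film-bag-17",
)
record.amend("laterality", "right", made_by="trainer",
             reason="label disagrees with image")
```

Here the correction happens where the data is being used, the earlier value is retained rather than overwritten silently, and `origin_ref` gives a route back to the setting the data came from.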
Conclusions: 1
• Achieving data mobility is less about making it independent of the context of production, and more about appropriately maintaining and carefully managing links to that context.
• We find that users continually (re)appraise data based on their understandings of practices associated with its production and abstraction.
• This is also shown in Zimmerman’s study of data reuse by ecologists, whereby the appropriateness of using third party datasets is gauged according to what ecologists know and understand about the specific phenomena and data collection practices.
Conclusions: 2
• Zimmerman asked ecologists to report retrospectively how they selected data for reuse whereas, in our study, we examined actual occasions of data reuse.
• While agreeing that greater detail of data collection practices should be made available, we take the more radical step of recommending capture of richer representations of the originating context.
Conclusions: 3
• We need to move away from ideas of linear processes and static data sets towards thinking of data as more organic, ‘living’ artefacts in need of periodic amendment, repair, renewal and retirement.
• If we shift our focus to accommodate non-linear aspects of data collection and the dynamic character of ‘live’ data, then this opens various opportunities for a radical reconfiguration of a variety of data management practices.
• This reconfiguration of data management needs to be taken seriously if the benefits of increased data re-use and sharing envisaged by eResearch are going to be realised fully.