
Data security issues, ethical issues and challenges to privacy in knowledge-intensive learning and its evaluation – Experiences from the LACE project
Tore Hoel, Oslo and Akershus University College of Applied Sciences
Finnish-Norwegian Workshop in Learning Analytics, Helsinki, Finland, 21–22 May 2015


You are free to: copy, share, adapt, or re-mix; photograph, film, or broadcast; blog, live-blog, or post video of this presentation, provided that you attribute the work to its author and respect the rights and licences associated with its components.

Who am I? Standards, OER, ICT in education, EU & Nordic projects, Learning Analytics Interoperability, Communication & Information Management.

Who we are now (47+ associate partners): the LACE Network and the LACE Consortium.

LACE Policy Briefing Event, 15 April 2015. The LACE vision: building bridges between research, policy and practice to realise the potential of learning analytics in the EU, across national, sectorial and cultural differences.

Evidence Hub, Map of Evidence: http://evidence.laceproject.eu/

Capture, write, share, sense-make: LACE website, Evidence Hub, guidance and white papers.

It's not about data security! "Data security issues, ethical issues and challenges to privacy in knowledge-intensive learning and its evaluation." What are the issues? What is the context? What are the research questions? (Not the concerns of the researchers.)

Learning analytics as "The Silent Storm".

Five storms that are going to change … (image slides). Source: www.slideshare.net/Scobleizer/age-of-context-september-2014

Natural Language Processing.
Tools are already available.

Access to a variety of data. In the race to the top you need data subjects who see their interest in sharing. (Image: Eiger north face, CC BY-SA 3.0, Terra3, en.wikipedia.org/wiki/Eiger#/media/File:North_face.jpg)

(Image slide: The Guardian, 12 March 2014.)

Who can we trust with our data? "Usage data may be shared with any business partner for research or to allow them to share information about their products and services that may be of interest [to you]." "Your information may be used by us and by technology partners and course and content providers chosen by us."

Modernization of EU Universities report (http://ec.europa.eu/education/library/reports/modernisation-universities_en.pdf):
- Recommendation 14: Member States should ensure that legal frameworks allow higher education institutions to collect and analyse learning data. The full and informed consent of students must be a requirement, and the data should only be used for educational purposes.
- Recommendation 15: Online platforms should inform users about their privacy and data protection policy in a clear and understandable way. Individuals should always have the choice to anonymise their data.

Data sharing: it's about research and scaling up learning analytics. Aims of data sharing:
- More useful analysis through combination of data from different sources
- Sufficient scale of data to determine relevance and quality of educational resources
- A critical mass of data for learning science research
- Reproducibility and transparency in LA research
- Cross-institutional strategy comparison
- Research on the effect of education policy
- Social learning in informal settings
Four major types of learning and kinds of questions EDM/LA can assist with (source: Mykola Pechenizkiy, TU Eindhoven, https://myweps.com/moodle/course/view.php?id=286):
- How to (re)organize the classes, assessment, or placement of materials based on usage and performance data?
- How to identify those who would benefit from provided feedback, study advice or other help?
- How to decide which kind of help would be most effective?
- How to help learners in (re-)finding useful material, whether done individually or collaboratively with peers?

Taxonomy for ethical, legal and logistical issues of learning analytics. Published March 2015 by Niall Sclater, Jisc, with input from Jisc, Apereo, and the LACE project. Groups of issues: ownership & control; consent; transparency; privacy; validity; access; actions to be taken; adverse impact; stewardship. (Photo: www.flickr.com/photos/dweickhoff/4762274448/in/pool-taxonomy)

Ownership & control:
- Overall responsibility: Who in the institution is responsible for the appropriate and effective use of learning analytics?
- Control of data for analytics: Who in the institution decides what data is collected and used for analytics?
- Breaking silos: How can silos of data ownership be broken in order to obtain data for analytics?
- Control of analytics: Who in the institution decides how analytics …

Consent (1):
- When to seek consent: In which situations should students be asked for consent to the collection and use of their data for analytics?
- Consent for anonymous use: Should students be asked for consent to the collection of data which will only be used in anonymised formats?
- Consent for outsourcing: Do students need to give specific consent if the collection and analysis of data is to be outsourced to third parties?
- Clear and meaningful: How can institutions avoid opaque privacy policies …
Consent (2):
- Right to anonymity: Should students be allowed to disguise their identity in certain circumstances?
- Adverse impact of opting out on the individual: If a student is allowed to opt out of data collection and analysis, could this have a negative impact on their academic progress?
- Adverse impact of opting out on the group: If individual students opt out, will the dataset be incomplete, thus potentially reducing the accuracy and effectiveness of learning analytics for the group?
- Lack of real choice to opt out: Do students have a genuine choice if pressure is put on them by the institution, or if they feel their academic success may be impacted by opting out?

Consent (3):
- Change of purpose: Should institutions request consent again if the data is to be used for purposes for which consent was not originally given?
- Legitimate interest: To what extent can the institution's legitimate interests override privacy controls for individuals?
- Unknown future uses of data: How can consent be requested when potential future uses of the (big) data are not yet known?
- Consent in open courses: Are open courses (MOOCs etc.) different when it comes to obtaining consent?

Transparency:
- Student awareness of data collection: What should students be told about the data that is being collected about them?
- Student awareness of data use: What should students be told about the uses to which their data is being put?
- Student awareness of algorithms and metrics: To what extent should students be given details of the algorithms used for learning analytics and the metrics and labels that are created?
- Proprietary …

Privacy (1):
- Out-of-scope data: Is there any data that should not be used for learning analytics?
- Tracking location: Under what circumstances is it appropriate to track the location of students?
- Staff permissions: To what extent should access to students' data be restricted within an institution?
- Unintentional creation of sensitive data: How do institutions avoid creating sensitive data (e.g. religion, ethnicity) from other data?

Privacy (2):
- Sharing data with other institutions: Under what circumstances is it appropriate to share student data with other institutions?
- Access for employers: Under what circumstances is it appropriate to give employers access to analytics on students?
- Enhancing trust by retaining data internally: If students are told that their data will be kept within the institution, will they develop greater trust in and acceptance of analytics?
- Use of metadata to identify individuals: Can students be identified from metadata even if personal data has been deleted? Does anonymisation of data become more difficult …

Validity:
- Minimisation of inaccurate data: How should an institution minimise inaccuracies in the data?
- Minimisation of incomplete data: How should an institution minimise incompleteness of the dataset?
- Optimum range of data sources: How many and which data sources are necessary to ensure accuracy in the analytics?
- Validation of algorithms and metrics: How should an institution validate its algorithms and metrics?
- Spurious correlations: How can institutions avoid drawing misleading conclusions from spurious correlations?
- Evolving nature of …: How accurate can analytics be when students …

Access:
- Student access to their data: To what extent should students be able to access the data held about them?
- Student access to their analytics: To what extent should students be able to access the analytics performed on their data?
- Data formats: In what formats should students be able to access their data?
- Metrics and labels: Should students see the metrics and labels attached to them?
- Right to correct inaccurate data: What data should students be allowed to correct about themselves?

Action (1):
- Institutional obligation to act: What obligation does the institution have to intervene when there is evidence that a student could benefit from additional support?
- Student obligation to act: What obligation do students have when analytics suggests actions to improve their academic progress?
- Conflict with study goals: What should a student do if the suggestions are in conflict with their study goals?
- Obligation to prevent continuation: What obligation does the institution have to prevent students from continuing on a pathway which analytics suggests is not advisable?
- Type of intervention: How are the appropriate interventions decided on?

Action (2):
- Staff incentives for intervention: What incentives are in place for staff to change practices and facilitate intervention?
- Failure to act: What happens if an institution fails to intervene when analytics suggests that it should?
- Need for human intermediation: Are some analytics better presented to students via e.g. a tutor than via a system?
- Triage: How does an institution allocate resources for learning analytics appropriately for learners with different requirements?

Adverse impact (1):
- Labelling bias: Does labelling or profiling of students bias institutional perceptions of and behaviours towards them?
- Oversimplification: How can institutions avoid overly simplistic metrics and decision making which ignore personal circumstances?
- Undermining of autonomy: Is student autonomy in decision making undermined by predictive analytics?
- Gaming the system: If students know that data is being collected about them, will they alter their behaviour to present themselves more positively, thus distracting them and skewing the analytics?

Adverse impact (2):
- Reinforcement of discrimination: Could analytics reinforce discriminatory attitudes and actions by profiling students based on their race or gender?
- Reinforcement of social power differentials: Could analytics reinforce social power differentials and students' status in relation to each other?
- Infantilisation: Could analytics infantilise students by spoon-feeding them with automated suggestions, making the learning process less demanding?
- Echo chambers: Could analytics create echo chambers where intelligent software reinforces our own attitudes and …

Stewardship:
- Data minimisation: Is all the data held on an individual necessary in order to carry out the analytics?
- Data processing location: Is the data being processed in a country permitted by the local data protection laws?
- Right to be forgotten: Can all data regarding an individual (except that necessary for statutory purposes) be deleted?
- Unnecessary data retention: How long should data be retained?
- Unhelpful data deletion: If data is deleted, does this restrict the institution's analytics capabilities, e.g. refining its models and …

Summary of LACE EP4LA workshops:
- Significant boundaries to combining different data sources
- Trade-off between LA benefits and privacy protection
- Concern about the effectiveness of, and limitations to, privacy protection methods
- Transparency, accountability, and control are key
- Is the binary opt-in/opt-out model realistic?

Understanding privacy.

What is the problem with privacy? Medical privacy: various methods have been used to protect patients' privacy. This 1822 drawing by Jacques-Pierre Maygnier shows a "compromise" procedure, in which the physician is kneeling before the woman but cannot see her genitalia. (Wikipedia)

Privacy about limitation and control? The debate regarding privacy has swung between arguments for and against a particular approach, with the limitation theory and the control theory dominating (Heath, 2014).

Contextual integrity: a framework to provide guidance for solving privacy-related conflicts, built on norms of appropriateness and norms of distribution (Nissenbaum, 2014).

Integrity respected or violated: contextual integrity is respected when the norms of appropriateness and distribution are respected; it is violated when any of the norms are infringed.

What are the research questions, if contextual integrity is our approach to privacy and data protection in education?

Code of Practice.
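The stewardship and privacy questions above (data minimisation, re-identification from metadata) can be made concrete with a small sketch. This is an illustrative example only, not anything from the LACE project: the record fields and the keyed-hash pseudonymisation scheme are assumptions made for the demo.

```python
import hashlib
import hmac

def pseudonymise(record, key, keep_fields):
    """Minimise a student record before it leaves the institution.

    Data minimisation: only the fields actually needed for the analysis
    are kept. Pseudonymisation: the direct identifier is replaced by a
    keyed hash, stable enough for longitudinal analysis but not
    reversible without the institution's secret key.
    """
    minimal = {k: v for k, v in record.items() if k in keep_fields}
    minimal["pseudonym"] = hmac.new(
        key, record["student_id"].encode(), hashlib.sha256
    ).hexdigest()[:16]
    return minimal

record = {
    "student_id": "s-1042",     # direct identifier - never shared
    "name": "Ada",              # not needed for the analysis
    "religion": "undisclosed",  # sensitive - out of scope for analytics
    "quiz_score": 87,
    "logins_last_30_days": 14,
}
shared = pseudonymise(record, b"institution-secret-key",
                      {"quiz_score", "logins_last_30_days"})
print(shared)  # only the pseudonym and the two behavioural fields remain
```

Note that this does not settle the taxonomy's open question: even after pseudonymisation, students may still be re-identifiable from the remaining behavioural metadata, which is exactly why anonymisation alone is not a sufficient privacy guarantee.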
"Policy on Ethical use of Student Data for Learning Analytics", published by the Open University, UK, 2014:
- Principle 1: Learning analytics is a moral practice, which should align with core organisational principles.
- Principle 2: The OU has a responsibility to all stakeholders to use and extract meaning from student data for the benefit of students where feasible.
- Principle 3: Students are not wholly defined by their visible data or our interpretation of that data.
- Principle 4: The purpose and the boundaries regarding the use of learning analytics should be well defined and visible.
- Principle 5: The OU should aim to be transparent regarding data collection, and provide students with the opportunity to update their own data at regular intervals.
- Principle 6: Students should be engaged as active agents in the implementation of learning analytics (e.g., personalised learning paths, interventions, etc.).
- Principle 7: Modelling and interventions based on analysis of data should be sound and free from bias.
- Principle 8: Adoption of learning analytics within the OU requires broad acceptance of the values and benefits (organisational culture) and the development of appropriate skills across the organisation.

Privacy by Design: the principles of data protection by design and data protection by default (European Commission, 2012). The 7 Foundational Principles of PbD:
- Proactive not reactive; preventative not remedial
- Privacy as the default setting
- Privacy embedded into design
- Full functionality: positive-sum, not zero-sum
- End-to-end security: full lifecycle protection
- Visibility and transparency: keep it open
- Respect for user privacy: keep it user-centric

Architectures for LA.

An architecture for learning analytics.

An architecture for learning analytics: How do we build trust mechanisms into our new tools?
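The "privacy as the default setting" principle can be sketched in code: a new student's consent profile starts with every optional data use switched off, and opting in is an explicit action. All names below are hypothetical illustrations, not part of any LACE or PbD specification.

```python
from dataclasses import dataclass

@dataclass
class ConsentProfile:
    """Hypothetical per-student consent settings for a learning
    analytics service. Privacy by default: every optional use starts
    switched off and must be explicitly enabled by the student."""
    share_with_third_parties: bool = False  # off until the student opts in
    track_location: bool = False            # off until the student opts in
    identifiable_reporting: bool = False    # aggregate reporting only
    raw_event_retention_days: int = 180     # delete raw events after this

# A newly registered student gets the privacy-protective defaults.
profile = ConsentProfile()
assert not profile.share_with_third_parties

# Opting in is a deliberate, recorded action, never the default.
profile.share_with_third_parties = True
```

The design point is that the safe state requires no action from the student; the system, not the data subject, carries the burden of protecting privacy.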
An architecture for learning analytics: Which pedagogical scenarios are we able to accommodate in an open LA architecture?

Take part in our study of data sharing issues: http://bit.ly/laissuesort. LACE website, Evidence Hub, guidance and white papers: browse, download, read, and join LACE!

"Not everything that can be counted counts. Not everything that counts can be counted." (William Bruce Cameron) Photo (CC)-BY Paul Stainthorp, https://www.flickr.com/photos/pstainthorp/5497004025

Hoel, T. (2015). Data security issues, ethical issues and challenges to privacy in knowledge-intensive learning and its evaluation – Experiences from the LACE project. Presentation at the Finnish-Norwegian Workshop in Learning Analytics, Helsinki, Finland, 22 May 2015. @tore, about.me/torehoel, [email protected]. This work was undertaken as part of the LACE Project, supported by the European Commission Seventh Framework Programme, grant 619424. These slides are provided under the Creative Commons Attribution Licence: http://creativecommons.org/licenses/by/4.0/. Some images used may have different licence terms. www.laceproject.eu @laceproject