Your organization and Big Data: Managing, access, privacy, & security
ARMA NS. November 26, 2015. Presented by Louise Spiteri
Defining Big Data
ARMA NS. Louise Spiteri
Big data is high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. http://www.gartner.com/it-glossary/big-data/
Big data is a term that describes large volumes of high velocity, complex, and variable data that require advanced techniques and technologies to enable the capture, storage, distribution, management, and analysis of the information. http://www.techamerica.org/Docs/fileManager.cfm?f=techamerica-bigdatareport-final.pdf
Defining Big Data
“Insights from Big Data can enable you to make better decisions. They can help you facilitate growth and organizational transformation, reduce costs and manage volatility and risk. This enables you to capitalize on new sources of revenue and generate more value for your organization.” Financial Accounting Advisory Services (n.d.). Big data strategy to support the CFO and governance agenda
The value of Big Data
The 4 Vs of Big Data
How much data does your organization generate?
Big Data tends to be measured in terabytes and petabytes (1 petabyte = 1,024 terabytes).
Definitions of “big” are relative, and fluctuate, especially as storage capacities increase over time.
Data is generated by every computerized system in the organization, including human resources solutions, supply-chain management software, and social media tools for marketing.
Volume
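As a rough illustration of the units above, the following sketch converts between terabytes and petabytes using the binary convention from the slide (1 PB = 1,024 TB). The daily volume of 500 TB is an assumed figure borrowed from the Facebook example in this deck, and the function names are illustrative only.

```python
TB_PER_PB = 1024  # binary convention used on the slide: 1 PB = 1,024 TB


def tb_to_pb(terabytes: float) -> float:
    """Convert terabytes to petabytes."""
    return terabytes / TB_PER_PB


def days_to_reach(target_pb: float, tb_per_day: float) -> float:
    """Days needed to accumulate target_pb at a steady daily rate."""
    return target_pb * TB_PER_PB / tb_per_day


daily_tb = 500  # assumed daily volume, taken from the Facebook example
print(f"{daily_tb} TB/day = {tb_to_pb(daily_tb):.3f} PB/day")
print(f"Days to accumulate 1 PB: {days_to_reach(1, daily_tb):.2f}")
```

At that assumed rate, a petabyte accumulates in roughly two days, which is why definitions of "big" shift as quickly as storage capacity does.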
Google indexes 20 billion pages per day.
Twitter has more than 500 million users and 400 million tweets per day. Facebook generates 2.7 million ‘Likes’, processes 500 TB of data, and receives 300 million photo uploads per day.
http://bit.ly/1SVxPwp; http://bit.ly/1SVy76j; http://bloom.bg/1SVyldK
Examples of volume
What types of data do you collect & manage?
Organizations generate various types of structured, semi-structured, and unstructured data.
Structured data is the tabular type found in spreadsheets or relational databases (about 10% of most data).
Text, images, audio, and video are examples of unstructured data, which sometimes lacks the structural organization required by machines for analysis.
Variety
How quickly does your data grow & change?
Velocity refers to the rate at which data is generated and the speed at which it should be analyzed and acted upon.
The proliferation of digital devices such as smartphones has led to an unprecedented rate of data creation and is driving a growing need for real-time analytics and evidence-based planning.
Velocity
How accurate & reliable is your data?
Some data is inherently unreliable: customer comments in social media, for example, entail subjective judgment.
We need to deal with imprecise and uncertain data. Is the data being stored and mined meaningful to the problem being analyzed?
Veracity
Big Data is often characterized by relatively “low value density”. That is, the data received in the original form usually has a low value relative to its volume. However, a high value can be obtained by analyzing large volumes of such data.
Value
Value is any application of big data that:
• Drives revenue increases (e.g., customer loyalty analytics)
• Identifies new revenue opportunities, improves quality and customer satisfaction (e.g., predictive maintenance)
• Saves costs (e.g., fraud analytics)
• Drives better outcomes (e.g., patient care).
Value
What Big Data looks like
Blogs, tweets, social networking sites (such as LinkedIn and Facebook), news feeds, discussion boards, and video sites all fall under Big Data.
Social media
Machine-generated data comes from a wide variety of devices, from RFID tags to sensors (optical, acoustic, seismic, thermal, chemical, scientific, and medical), and even from weather monitoring.
Machine-generated data
From the GPS systems in our cars, planes, and ships to GPS apps on smartphones, we use GPS to guide our movements.
Location technology is also used to track our movements, for example through emergency beacons, and by retailers who use in-store WiFi networks to access shoppers’ smartphones and track their shopping habits.
Location Based Services (LBS) allow us to deliver services based on the location of moving objects such as cars or people with mobile phones.
GPS and spatial data
Mining Big Data
It is generally thought that the true value of Big Data is seen only when it is used to drive decision making.
You need efficient processes to turn high volumes of fast-moving and varied data into meaningful insights.
As information managers, you might not be doing the analysis, but you have a crucial role to play in managing this data to enable this analysis.
Big Data analytics: How do we mine our data?
Text analytics extract information from textual data.
• Social network feeds, emails, blogs, online forums, survey responses, corporate documents, news, and call centre logs are examples of textual data held by organizations.
Text analytics enable organizations to convert large volumes of human-generated text into meaningful summaries, which support evidence-based decision-making.
Text analytics
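One very simple form of text analytics is counting which terms recur across a set of documents. The sketch below is only an illustration under stated assumptions: the sample comments are invented, the stop-word list is a small hypothetical one, and real text analytics tools go well beyond term counts.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real project would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "to", "of", "my", "it", "in", "for", "i"}


def top_terms(documents, n=3):
    """Return the n most frequent non-stop-word terms across all documents."""
    counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


# Invented sample comments, standing in for survey responses or call-centre logs.
comments = [
    "The delivery was late and the packaging was damaged.",
    "Late delivery again. Support was helpful.",
    "Great support, but delivery is slow.",
]
print(top_terms(comments))
```

Even this crude summary surfaces "delivery" as the recurring theme, which is the kind of signal that supports evidence-based decisions.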
Audio analytics analyze and extract information from unstructured audio data. Customer call centres and healthcare are the primary application areas of audio analytics.
• Call centres use audio analytics for efficient analysis of recorded calls to improve customer experience, evaluate agent performance, and so forth.
• In healthcare, audio analytics support diagnosis and treatment of certain medical conditions that affect the patient’s communication patterns (e.g., schizophrenia), or analyze an infant’s cries to learn about the infant’s health and emotional status.
Audio analytics
Video analytics involves a variety of techniques to monitor, analyze, and extract meaningful information from video streams.
The increasing prevalence of closed-circuit television (CCTV) cameras and of video-sharing websites are the two leading contributors to the growth of computerized video analysis. A key challenge, however, is the sheer size of video data.
Video analytics
Social media analytics refer to the analysis of structured and unstructured data from social media channels.
• Social networks (e.g., Facebook and LinkedIn)
• Blogs (e.g., Blogger and WordPress)
• Microblogs (e.g., Twitter and Tumblr)
• Social news (e.g., Digg and Reddit)
• Social bookmarking (e.g., Delicious and StumbleUpon)
• Media sharing (e.g., Instagram and YouTube)
• Wikis (e.g., Wikipedia and Wikihow)
• Question-and-answer sites (e.g., Yahoo! Answers and Ask.com)
• Review sites (e.g., Yelp and TripAdvisor)
Social media analytics
Predictive analytics comprise a variety of techniques that predict future outcomes based on historical and current data, e.g., predicting customers’ travel plans based on what they buy, when they buy, and even what they say on social media.
Predictive analytics
Privacy and security of Big Data
• More data = higher risk of exposure in the event of a breach.
• More experimental usage = the organization's governance and security protocols are less likely to be in place.
• New types of data are uncovering new privacy implications, with few privacy laws or guidelines to protect that information (e.g., cell phone beacons that broadcast physical location, & health devices such as medical, fitness and lifestyle trackers).
• Data linkage and combined sensitive data. The act of combining multiple data sources can create unanticipated sensitive data exposure.
Considerations for Big Data
“The protection of information and information systems from unauthorized access, use, disclosure, disruption, modification, or destruction in order to provide confidentiality, integrity, and availability.” National Institute of Standards and Technology
http://nvlpubs.nist.gov/nistpubs/ir/2013/NIST.IR.7298r2.pdf
Information security: Definition
“The claim of individuals, groups or institutions to determine for themselves when, how and to what extent, information about them is communicated to others.” International Association of Privacy Professionals
https://iapp.org/resources/privacy-glossary
Data privacy: Definition
Under the federal Personal Information Protection and Electronic Documents Act (PIPEDA), “personal information” is “information about an identifiable individual, but does not include the name, title or business address or telephone number of an employee of an organization.”
Regulatory framework for big data
The protection of personal information in Canada rests on three fundamental goals:
• Transparency – providing people with a basic understanding of how their personal information will be used in order to gain informed consent;
• Limiting use plus consent – using that information only for the declared purpose for which it was initially collected, or for purposes consistent with that use; and
• Minimization – limiting the personal information collected to what is directly relevant and necessary to accomplish the declared purpose, and discarding the data once the original purpose has been served.
PIPEDA and big data
Organizations that attempt to implement Big Data initiatives without a strong governance regime in place risk facing ethical dilemmas with no set processes or guidelines to follow.
A strong ethical code, along with process, training, people, and metrics, is imperative to govern what organizations can do within a Big Data program.
Big Data governance
Data used for Big Data analytics can be gathered and combined from different sources to create new data sets.
Organizations must make sure that all security and privacy requirements that are applied to their original data sets are tracked and maintained across Big Data processes throughout the information life cycle, from data collection to disclosure or retention/destruction.
Respecting the original intent of the information gathered
Data that has been processed, enhanced, or changed by Big Data analytics should be anonymized to protect the privacy of the original data source, such as customers or vendors.
Data that is not properly anonymized prior to external release (or, in some cases, internal release) may compromise data privacy when it is combined with previously collected, complex data sets.
Re-Identification
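To make the re-identification risk concrete, the sketch below links a data set that has had names removed to a second list that still carries names, using shared attributes. Everything here is an assumption for illustration: the records are invented, and the choice of postal prefix, birth year, and sex as linking fields is hypothetical.

```python
# Invented records: a "de-identified" release and a separate list with names.
released = [
    {"postal": "B3H", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"postal": "B3H", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
    {"postal": "B4A", "birth_year": 1980, "sex": "F", "diagnosis": "migraine"},
]
public = [
    {"name": "A. Smith", "postal": "B3H", "birth_year": 1980, "sex": "F"},
    {"name": "B. Jones", "postal": "B4A", "birth_year": 1980, "sex": "F"},
    {"name": "C. Brown", "postal": "B3J", "birth_year": 1990, "sex": "M"},
]
QUASI_IDENTIFIERS = ("postal", "birth_year", "sex")  # assumed linking fields


def key(record):
    """Build the linking key from the quasi-identifier fields."""
    return tuple(record[f] for f in QUASI_IDENTIFIERS)


def reidentify(released_rows, public_rows):
    """Link people whose quasi-identifiers match exactly one released record."""
    by_key = {}
    for row in released_rows:
        by_key.setdefault(key(row), []).append(row)
    matches = []
    for person in public_rows:
        candidates = by_key.get(key(person), [])
        if len(candidates) == 1:  # unique match: the person is re-identified
            matches.append((person["name"], candidates[0]["diagnosis"]))
    return matches


print(reidentify(released, public))
```

Neither data set is sensitive in the same way on its own; the exposure appears only once the two are combined.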
Matching data sets from third parties may provide valuable insights that could not be obtained with your data alone.
You need to consider and evaluate the adequacy of the security and privacy data protections in place at the third-party organizations.
Third-party use
Big data’s potential for predictive analysis raises particular concerns for data security and privacy.
• Think of the famous case of Target, which sent coupons to a teenage girl after her shopping preferences suggested that she was pregnant and even indicated her due date (Target was accurate). The girl’s family found out about her pregnancy through these coupons.
• Did the girl know that her shopping information would be used for this purpose?
• Was she informed of Target’s privacy policy?
The risks of predictive analytics
There are growing concerns that Big Data is straining the privacy principles of identifying purposes and limited use.
Consumers are called upon to agree to privacy policies and consent forms that no one has the time to read. The burden is increasingly placed on consumers, as these policies take the form of disclaimers for the organizations.
Increasing burden on the consumer
“Just because commercial organizations can collect personal information and run it through the revealing algorithms of predictive analytics, doesn’t mean that they should.” Jennifer Stoddart
https://www.priv.gc.ca/media/sp-d/2013/sp-d_20131017_e.asp
Can we vs. should we?
A useful tool is the Privacy Maturity Model designed by the American Institute of Certified Public Accountants (AICPA) and the Canadian Institute of Chartered Accountants (CICA). These sections are particularly relevant:
• 1.2.3: Personal Information Identification and Classification
• 1.2.4: Risk Assessment
• 1.2.6: Infrastructure and Systems Management
• 3.2.2: Consent for New Purposes and Uses
• 4.2.4: Information Developed About Individuals
• 8.2.1: Information Security Program
http://bit.ly/1SrCcih
Privacy assessment
Privacy Life cycle (from Maturity Model)
Information governance life cycle for Big Data
Strong data governance policies and procedures are important:
• Who owns the data?
• Who is responsible for protecting the data?
• How is data collected?
• What data is collected?
• How is the data retained?
Handling & retaining data
What security & privacy regulations apply to your data?
What are the compliance provisions of your agreements with any third parties or service providers? What are their privacy and security policies?
Develop a solid compliance framework with a risk-based map for implementation and maintenance.
Compliance
Develop case scenarios where you would use Big Data.
Identify what data will be used and how.
Identify possible risks.
In this way, you are prepared before you actually use the Big Data, rather than having to react when something goes wrong.
Data use cases
Tell your customers what personal data you collect and how you use it.
Provide consistent consent mechanisms across all products.
Ensure that customers have the means to withdraw their consent at the individual device level.
Manage consent
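One way to picture consent that can be withdrawn at the individual device level is a registry keyed by customer, device, and purpose. This is a minimal sketch, not a recommended design: the `ConsentRegistry` class, its method names, and the sample identifiers are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRegistry:
    """Tracks consent per (customer, device, purpose); the default is no consent."""
    _granted: set = field(default_factory=set)

    def grant(self, customer, device, purpose):
        self._granted.add((customer, device, purpose))

    def withdraw(self, customer, device, purpose=None):
        """Withdraw one purpose, or every purpose on the device if purpose is None."""
        self._granted = {
            (c, d, p)
            for (c, d, p) in self._granted
            if not (c == customer and d == device and (purpose is None or p == purpose))
        }

    def allowed(self, customer, device, purpose):
        return (customer, device, purpose) in self._granted


registry = ConsentRegistry()
registry.grant("cust-1", "phone", "analytics")
registry.grant("cust-1", "watch", "analytics")
registry.withdraw("cust-1", "watch")  # withdraw on one device only
print(registry.allowed("cust-1", "phone", "analytics"))
print(registry.allowed("cust-1", "watch", "analytics"))
```

Because the same check applies to every product, the consent mechanism stays consistent while withdrawal on one device leaves consent on the others untouched.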
Have rigorous controls over who has access to the data.
Periodically review who has access rights, and ensure that rights are removed immediately when they are no longer required.
Access management
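A periodic access review can be as simple as comparing the list of access grants with the current staff list and recent usage. The sketch below assumes a hypothetical record layout (`user`, `dataset`, `last_used`) and an arbitrary 90-day idle threshold; both would need to match your organization's own policy.

```python
from datetime import date, timedelta


def review_access(grants, active_staff, today, max_idle_days=90):
    """Return grants to revoke: the user has left, or the access has gone unused."""
    cutoff = today - timedelta(days=max_idle_days)
    to_revoke = []
    for grant in grants:
        if grant["user"] not in active_staff:
            to_revoke.append((grant["user"], grant["dataset"], "no longer on staff"))
        elif grant["last_used"] < cutoff:
            to_revoke.append((grant["user"], grant["dataset"], "unused"))
    return to_revoke


# Invented sample data.
grants = [
    {"user": "alice", "dataset": "customers", "last_used": date(2015, 11, 20)},
    {"user": "bob", "dataset": "customers", "last_used": date(2015, 6, 1)},
    {"user": "carol", "dataset": "payroll", "last_used": date(2015, 11, 1)},
]
active_staff = {"alice", "bob"}
flagged = review_access(grants, active_staff, today=date(2015, 11, 26))
for user, dataset, reason in flagged:
    print(f"Revoke {user} on {dataset}: {reason}")
```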
Remove all Personally Identifiable Information (PII) from a data set and turn it into non-identifying data.
Monitor anonymization requirements and analyze the risks of re-identification.
Anonymization
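The two steps above can be sketched as follows: drop the direct identifiers, coarsen the remaining attributes, then measure how small the smallest group of look-alike records is (often called k-anonymity). The field names, the generalization rules, and the sample customers are assumptions made for illustration, and a small k is only one indicator of re-identification risk.

```python
from collections import Counter

DIRECT_IDENTIFIERS = {"name", "email"}             # assumed PII fields to drop
QUASI_IDENTIFIERS = ("postal_prefix", "age_band")  # assumed fields that may still single someone out


def anonymize(record):
    """Drop direct identifiers and coarsen postal code and age."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    out["postal_prefix"] = out.pop("postal")[:3]
    decade = out.pop("age") // 10 * 10
    out["age_band"] = f"{decade}-{decade + 9}"
    return out


def k_anonymity(records):
    """Size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[f] for f in QUASI_IDENTIFIERS) for r in records)
    return min(groups.values())


# Invented sample data.
customers = [
    {"name": "A. Smith", "email": "a@example.com", "postal": "B3H 4R2", "age": 34, "spend": 120},
    {"name": "B. Jones", "email": "b@example.com", "postal": "B3H 1W5", "age": 37, "spend": 80},
    {"name": "C. Brown", "email": "c@example.com", "postal": "B4A 2K1", "age": 52, "spend": 45},
]
safe = [anonymize(c) for c in customers]
print(safe[0])
print("k =", k_anonymity(safe))
```

Here k comes out as 1 because the third customer is the only one in their postal area and age band, so removing names alone has not made that record non-identifying.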
Maintain your responsibility to your customers when you share data with third parties.
Include specific Big Data provisions within contractual agreements.
Monitor third parties for compliance with data-sharing agreements.
Data sharing
Examples of data breaches
Information is Beautiful: interactive view of big data breaches
http://bit.ly/1SrCghQ
Big data breaches, 1
Big data breaches, 2
Internal Revenue Service (US)
• An unnamed source used an IRS app in an attempt to download forms on 200,000 people.
• They succeeded in downloading half that number and used 15,000 of the forms to claim tax refunds in other people’s names.
Government breach, 1
Australian Immigration Department
• An employee of the department inadvertently sent passport, visa, and personal information of all the world leaders attending the Brisbane Summit to the organizers of the Asian Cup football tournament.
Government breach, 2
• [email protected]
• @Cleese6
• LinkedIn: http://bit.ly/1SrCm9g
• AboutMe: https://about.me/louisespiteri
• ResearchGate: http://bit.ly/1SrCqWB
• School of Information Management: www.dal.ca/sim
Contact information
http://www.looiconsulting.com/home/enterprise-big-data/
http://www.ibmbigdatahub.com/sites/default/files/infographic_file/4-Vs-of-big-data.jpg
http://www.kscpa.org/writable/files/AICPADocuments/10-229_aicpa_cica_privacy_maturity_model_finalebook.pdf
http://blog.templatemonster.com/2013/04/30/thank-you-pages-optimization/
Image sources