Privacy and Security in the VLDS
2
Commonwealth Security Benefits (Intended)
• Confidence in the integrity of the data and the systems processes
• Assistance in compliance with laws and regulations involving confidentiality
• A secure environment in which to perform business activities of the Commonwealth
• Identification and protection of key business functions and services in the event of disaster
• Monitoring for intrusions and network "attacks" on Commonwealth systems
3
SEC 501-01: The Commonwealth’s IS Security Standard Chapters
• Risk Management
• IT Contingency Planning
• Information Systems Security
• Logical Access Control
• Data Protection
• Facilities Security
• Personnel Security
• Threat Management
• IT Asset Management
4
Government Data Collection and Dissemination Practices Act (selected items)
• § 2.2-3803. Administration of systems including personal information; Internet privacy policy; exceptions.
• A. Any agency maintaining an information system that includes personal information shall:
– 1. Collect, maintain, use, and disseminate only that personal information permitted or required by law to be so collected, maintained, used, or disseminated, or necessary to accomplish a proper purpose of the agency;
– 5. Make no dissemination to another system without (i) specifying requirements for security and usage including limitations on access thereto, and (ii) receiving reasonable assurances that those requirements and limitations will be observed.
– 6. Maintain a list of all persons or organizations having regular access to personal information in the information system;
– 7. Maintain for a period of three years or until such time as the personal information is purged, whichever is shorter, a complete and accurate record, including identity and purpose, of every access to any personal information in a system, including the identity of any persons or organizations not having regular access authority but excluding access by the personnel of the agency wherein data is put to service for the purpose for which it is obtained;
– 8. Take affirmative action to establish rules of conduct and inform each person involved in the design, development, operation, or maintenance of the system, or the collection or use of any personal information contained therein, about all the requirements of this chapter, the rules and procedures, including penalties for noncompliance, of the agency designed to assure compliance with such requirements;
5
Government Data Collection and Dissemination Practices Act
• § 2.2-3805. Dissemination of reports
– Any agency maintaining an information system that disseminates statistical reports or research findings based on personal information drawn from its system, or from other systems shall:
• 1. Make available to any data subject or group, without revealing trade secrets, methodology and materials necessary to validate statistical analysis, and
• 2. Make no materials available for independent analysis without guarantees that no personal information will be used in any way that might prejudice judgments about any data subject.
• § 2.2-3806. Rights of data subjects.
– 2. Give notice to a data subject of the possible dissemination of part or all of this information to another agency, nongovernmental organization or system not having regular access authority, and indicate the use for which it is intended, and the specific consequences for the individual, which are known to the agency, of providing or not providing the information.
6
Family Educational Rights and Privacy Act
(2008 Amendments to Regulations)
• State Consolidated Education Data Systems
– …the Department has been working closely with SEAs to establish or upgrade State data systems in order to manage information generated by assessments, and use the data to improve student academic achievement and close achievement gaps. Changes to § 99.35(b) make it possible for SEAs and other State educational authorities to implement K-16 accountability systems by redisclosing personally identifiable student information on behalf of LEAs and postsecondary institutions provided they have legal authority to audit or evaluate one another's education programs.
– Additionally, under FERPA, State educational authorities, such as SEAs and higher education commissions, may disclose education records in personally identifiable form, without consent, to contractors, consultants, and other parties to whom they have outsourced organizational services or functions, including evaluation of Federal or State supported education programs under § 99.35, provided that the State educational authority has direct control over that outside party.
7
Relevant SCHEV Language
• § 23-9.6:1. Duties of Council generally.
– 9. Develop a uniform, comprehensive data information system designed to gather all information necessary to the performance of the Council's duties. The system shall include information on admissions, enrollments, self-identified students with documented disabilities, personnel, programs, financing, space inventory, facilities and such other areas as the Council deems appropriate. When consistent with the Government Data Collection and Dissemination Practices Act, the Virginia Unemployment Compensation Act, and applicable federal law, the Council, acting solely or in partnership with the Virginia Department of Education or the Virginia Employment Commission, may contract with private entities to create de-identified student records for the purpose of assessing the performance of institutions and specific programs relative to the workforce needs of the Commonwealth. For the purposes of this section, "de-identified student records" means records in which all personally identifiable information has been removed.
8
Component Overview
Data
9
Data Request
New Data Request
[Flowchart; swimlanes: Portal, Lexicon, Workflow, Reporting, Shaker]
• Provide a login option to a registered user
• Expose metadata to the user
• Provide the user a facility to build ad hoc queries
• Validate the query for data/security check
• Pass the validation check? (YES / NO)
• Send an email to approvers notifying them of the new data request
• Provide approvers the facility to review user-submitted information for intent and accuracy
• Request approved by approvers?
– NO: Send an email to the user explaining the reasons for denial
– YES: Send an email to the user notifying them that the query has been approved, with an ETA for the data
• Submit the query to the Shaker
• Execute the query and store the data internally
• Save data and create file
• Notify the workflow that the query is complete
• Send an email to the user notifying them that the data is available for download
• Provide a facility to the user to access and download the data set
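The request flow above can be sketched as a small Python function. This is only an illustration under stated assumptions: the names process_request, Status, and the four callables are hypothetical, and the slide does not say what happens when the validation check fails, so the sketch simply ends the request at that point.

```python
from enum import Enum


class Status(Enum):
    FAILED_VALIDATION = "failed validation"
    DENIED = "denied"
    COMPLETE = "complete"


def process_request(query, validate, approve, run_query, notify):
    """Walk one data request through the flowchart steps.

    validate, approve, run_query and notify are caller-supplied callables,
    hypothetical stand-ins for the Portal check, the approvers, the Shaker
    and the e-mail service.
    """
    if not validate(query):
        # Assumption: a failed data/security check ends the request here.
        return Status.FAILED_VALIDATION, None
    notify("approvers", "new data request")
    if not approve(query):
        notify("user", "request denied")
        return Status.DENIED, None
    notify("user", "request approved")
    data = run_query(query)  # the Shaker executes the query and stores the data
    notify("user", "data available for download")
    return Status.COMPLETE, data


# Usage with stub callables that record every notification.
sent = []
status, data = process_request(
    "demo query",
    validate=lambda q: True,
    approve=lambda q: True,
    run_query=lambda q: ["row 1", "row 2"],
    notify=lambda to, msg: sent.append((to, msg)),
)
```

Passing the steps in as callables keeps the sketch independent of any particular portal, workflow, or mail component.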
10
Security Overview
[Figure: security overview diagram. Labels: Aggregated Data (Suppressed); Aggregated Data (Non-Suppressed); Unit Record Level Data; Account Management; Portal Components; Anonymous; Named; Schools; Researchers; Agency Employees; System Admin]
11
Security
[Figure: authentication and authorization diagram. Components: Lexicon; Auth Directory; Set User Account\Permission; Active Directory; Large File FTP; Reporting (QBT); Shaker (DQE); Portal; Workflow. User groups: Commonwealth Employees; University Research. Connection labels: Authentication; Authorization]
Authorization labels in the diagram:
• Database, Table, Column
• Role Based, Permission
• Viewing
• Viewing, Editing
• Suppressed Data, Non-Suppressed Data
12
Reporting: Record Level Linked Data
[Figure: report flow diagram. Labels: Workflow; Approval; Report Creation (Ad Hoc interface) (notes 1, 2); QBT; Lexicon; Shell Database (notes 1, 2); Ad Hoc Metadata; Query Results (notes 5, 6); Shaker (notes 3, 4); Source Data: DOE, SCHEV, VEC; Results]
Shell Database notes:
1. Instantiates the information contained in the Lexicon.
2. Contains dummy data.
Report flow notes:
1. Report link will display report with dummy data.
2. Report will have a button that will allow submission of report to workflow.
3. Distributed query engine generates queries to each of the source data systems and joins the result sets.
4. Engine will interact with Lexicon.
5. Options for report display include a Logi Analysis Grid (depending on the number of records returned) or a link to download a file.
6. Access may be provided through Ad Hoc report portal.
Lexicon – Shaker Process
[Figure: Lexicon and Shaker process diagram. Labels: DS 1; DS 2; DS 3; Lexicon; Linking Control; Data Access Control; User Interface/Portal/LogiXML; Sub-Query Optimization; Hashed ID Matrix; Authorized Query; Query Results; Workflow Manager; Sample Data; Query Building Process (Pre-Authorization)?]
Common IDs [deterministic] or Common Elements with appropriate Transforms, Matching Algorithms and Thresholds [probabilistic]
A linking engine process will update the Lexicon periodically to allow query building on known available matched data fields. No data is used in this process. Queries are built on the relationships between data fields in the Lexicon.
14
Matched Hash ID Values
Hash_ID             GNDR  RACE  GRADYEAR  SCH_ID  LEA_ID  GPA  STUTYPE  RECTYPE  REPYEAR  FICE    LOCDOMI
1a473bg9i4qz0o8ya6  1     6     0708      10034   0107    3.4  1        2        0708     001444  0107
2r128yc8m1ls3v5jq4  1     5     0708      0403    0107    3.1  1        2        0708     006622  0107
3c903zo7b1tb0s2lw9  2     2     0708      7894    0107    3.9  1        2        0708     003705  0107
7a870rk2q6fo5u3vu4  1     4     0708      0405    0107    3    1        2        0708     003705  0107
0b737go6n2ks1x9sf5  2     6     0708      7789    0107    3.8  1        2        0708     003746  0107
9f802jr2t5ia6u4mj3  2     2     0708      9878    0107    4    1        2        0708     003735  0107
• The SLDS server will match records from different agencies using the Hash ID.
• After records are matched, the SLDS server will delete the Hash ID values and replace them with randomly generated unique IDs.
April 19, 2023
Random_ID   GNDR  RACE  GRADYEAR  SCH_ID  LEA_ID  GPA  STUTYPE  RECTYPE  REPYEAR  FICE    LOCDOMI
31xc7t65iq  1     6     0708      10034   0107    3.4  1        2        0708     001444  0107
99mh9r43yt  1     5     0708      0403    0107    3.1  1        2        0708     006622  0107
29px4w80iz  2     2     0708      7894    0107    3.9  1        2        0708     003705  0107
41aa2u63cl  1     4     0708      0405    0107    3    1        2        0708     003705  0107
03bq5n88hs  2     6     0708      7789    0107    3.8  1        2        0708     003746  0107
75ew7y36kk  2     2     0708      9878    0107    4    1        2        0708     003735  0107
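The ID replacement step can be illustrated with a short Python sketch. It is not the SLDS implementation: the function name replace_hash_ids, the use of the standard secrets module, and the 10-character ID length are assumptions made for illustration.

```python
import secrets


def replace_hash_ids(rows, id_field="Hash_ID", new_field="Random_ID"):
    """Return copies of rows with the hash ID swapped for a random ID."""
    mapping = {}  # hash ID -> random ID; discarded when the function returns
    used = set()
    out = []
    for row in rows:
        hash_id = row[id_field]
        if hash_id not in mapping:
            random_id = secrets.token_hex(5)  # 10 hexadecimal characters
            while random_id in used:          # regenerate on a collision
                random_id = secrets.token_hex(5)
            used.add(random_id)
            mapping[hash_id] = random_id
        rest = {k: v for k, v in row.items() if k != id_field}
        out.append({new_field: mapping[hash_id], **rest})
    return out


matched = [
    {"Hash_ID": "1a473bg9i4qz0o8ya6", "GPA": "3.4", "FICE": "001444"},
    {"Hash_ID": "2r128yc8m1ls3v5jq4", "GPA": "3.1", "FICE": "006622"},
]
released = replace_hash_ids(matched)
```

Because the hash-to-random mapping is never returned or stored, the released rows cannot be traced back to the Hash ID values from this output alone.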
Possible Connection using Web Service – creates a Web Services Data Source (Oracle), which enables application and data integration by turning an external web service into an SQL data source, making external web services appear as regular SQL tables. This table function represents the output of calling external web services and can be used in an SQL query.
Possible Connection using Homogeneous link between Oracle DBs – establish synonyms for global names of remote objects in the distributed system so that the Shaker can access them with the same syntax as local objects.
DOE Data Fields
Hash_ID             GNDR  RACE  GRADYEAR  SCH_ID  LEA_ID  GPA
1a473bg9i4qz0o8ya6  1     6     0708      10034   0107    3.4
2r128yc8m1ls3v5jq4  1     5     0708      0403    0107    3.1
3c903zo7b1tb0s2lw9  2     2     0708      7894    0107    3.9
2e749xv8k3hd8r4qk1  2     1     0708      7894    0107    2.4
7a870rk2q6fo5u3vu4  1     4     0708      0405    0107    3
4u924rn7n9sq1b9uf7  2     6     0708      10033   0107    2.7
0b737go6n2ks1x9sf5  2     6     0708      7789    0107    3.8
9f802jr2t5ia6u4mj3  2     2     0708      9878    0107    4
SCHEV Data Fields
Hash_ID             STUTYPE  RECTYPE  GNDR  RACE  REPYEAR  FICE    LOCDOMI
1a473bg9i4qz0o8ya6  1        2        1     6     0708     001444  0107
2r128yc8m1ls3v5jq4  1        2        1     5     0708     006622  0107
3c903zo7b1tb0s2lw9  1        2        2     2     0708     003705  0107
7a870rk2q6fo5u3vu4  1        2        1     4     0708     003705  0107
0b737go6n2ks1x9sf5  1        2        2     6     0708     003746  0107
9f802jr2t5ia6u4mj3  1        2        2     2     0708     003735  0107
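Merging the two sources on the hashed ID amounts to an inner join. The Python sketch below uses a few rows and columns from the DOE and SCHEV tables; the function name merge_on_hash_id is hypothetical, and it assumes each source holds at most one row per Hash_ID.

```python
doe = [
    {"Hash_ID": "1a473bg9i4qz0o8ya6", "SCH_ID": "10034", "GPA": "3.4"},
    {"Hash_ID": "2e749xv8k3hd8r4qk1", "SCH_ID": "7894", "GPA": "2.4"},
    {"Hash_ID": "9f802jr2t5ia6u4mj3", "SCH_ID": "9878", "GPA": "4"},
]
schev = [
    {"Hash_ID": "1a473bg9i4qz0o8ya6", "FICE": "001444"},
    {"Hash_ID": "9f802jr2t5ia6u4mj3", "FICE": "003735"},
]


def merge_on_hash_id(left, right, key="Hash_ID"):
    """Inner join: keep only records whose hash ID appears in both sources."""
    right_by_id = {row[key]: row for row in right}  # assumes unique IDs
    merged = []
    for row in left:
        match = right_by_id.get(row[key])
        if match is not None:
            merged.append({**row, **match})
    return merged


merged = merge_on_hash_id(doe, schev)
```

The DOE record 2e749xv8k3hd8r4qk1 has no SCHEV counterpart, so it is absent from the merged result, just as it is absent from the matched table on the slide.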
Sub-query processing priority will be determined for each query to minimize unnecessary data transfer (e.g. not downloading unmatched records unless specifically requested) and to optimize join performance – see Query Sub-Process Optimization
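One way to avoid downloading unmatched records is a two-phase fetch: compare hashed IDs first, then pull full records only for the IDs every source shares. The sketch below is an assumption about how such a step could look, not the Shaker's actual optimization; fetch_matched is a hypothetical name, and in-memory dictionaries stand in for remote sub-queries.

```python
def fetch_matched(sources):
    """Two-phase fetch: hashed IDs first, then only the matched records.

    sources maps a source name to a dict of {hash_id: record}. In a real
    deployment each lookup would be a remote sub-query (assumption).
    """
    # Phase 1: transfer only the hashed IDs and intersect them.
    id_sets = [set(records) for records in sources.values()]
    matched = set.intersection(*id_sets) if id_sets else set()
    # Phase 2: pull full records for the matched IDs only.
    return {
        name: {hash_id: records[hash_id] for hash_id in sorted(matched)}
        for name, records in sources.items()
    }


sources = {
    "DOE": {
        "1a473bg9i4qz0o8ya6": {"GPA": "3.4"},
        "4u924rn7n9sq1b9uf7": {"GPA": "2.7"},
    },
    "SCHEV": {
        "1a473bg9i4qz0o8ya6": {"FICE": "001444"},
    },
}
fetched = fetch_matched(sources)
```

Only the record for 1a473bg9i4qz0o8ya6 is transferred from each source; the unmatched DOE record stays where it is.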
Possible Connection using Heterogeneous link using available Transparent Gateway or Generic ODBC/OLE
Merging UR Data on Hashed-IDs
Add’l Data Sources
15
Data Architecture
[Figure: data architecture diagram. Labels: DS 1; DS 2; DS 3; Lexicon; SPs (note 3); Aggregate Linked Data; Shaker/Deidentified Record Level Data (note 2); VITA (CESC); Aggregate Linked Reports; Record Level Query / Reports; Lexicon UI / Admin; ETL (note 1); Metadata and Security (note 1); Shell DB; Workflow; SLDS Portal]
1. Contains DBs for Shaker, Ad Hoc metadata, logging, auditing, etc.
2. Database for the Shaker process that temporarily stores linked record level data. The temporary tables will be dropped after a set period of time.
3. For canned reports, Stored Procedures will be used for data querying and suppression.
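The slides do not give the suppression rule applied to aggregate reports. As a minimal sketch of what suppression of aggregate counts can look like, the Python below masks any count under a threshold; the threshold of 10, the "*" marker, and the function name are all illustrative assumptions, not VLDS rules.

```python
def suppress_small_cells(counts, threshold=10, marker="*"):
    """Mask aggregate counts that fall below the threshold.

    threshold=10 and marker="*" are illustrative assumptions, not VLDS rules.
    """
    return {
        group: (count if count >= threshold else marker)
        for group, count in counts.items()
    }


counts = {"School A": 42, "School B": 3, "School C": 10}
suppressed = suppress_small_cells(counts)
```

In the architecture described above this logic would live in a stored procedure rather than in application code; the Python form is used here only to keep the example self-contained.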
16
Security
• Authentication – COV AUTH
• Authorization
  – Role Based
    • Anonymous User
    • Named User
      – System Administrator
      – Agency Employee
      – Researcher
  – Permissions
    • Workflow
    • Reports (Suppressed and Non-Suppressed)
    • Query Building Tool
    • Lexicon
    • Data elements
    • User Account Management
• Data security enforced by/at ….
  – Portal
  – Lexicon
    • Viewing
    • Editing
  – Reports
    • Suppressed Data
    • Non-Suppressed Data
  – Workflow
  – Data
    • Database
    • Table
    • Column
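A role-based check of this kind can be sketched in a few lines of Python. The role and permission names come from the slide, but the assignment of permissions to roles below is entirely hypothetical, since the slide does not state which role receives which permission.

```python
# Hypothetical role-to-permission mapping; the slide names the roles and the
# permission areas but not which role receives which permission.
ROLE_PERMISSIONS = {
    "anonymous": {"reports_suppressed"},
    "researcher": {"reports_suppressed", "query_building_tool", "workflow"},
    "agency_employee": {
        "reports_suppressed", "reports_non_suppressed",
        "query_building_tool", "workflow", "lexicon",
    },
    "system_administrator": {
        "reports_suppressed", "reports_non_suppressed",
        "query_building_tool", "workflow", "lexicon",
        "data_elements", "user_account_management",
    },
}


def is_allowed(role, permission):
    """Return True if the role holds the permission; unknown roles get nothing."""
    return permission in ROLE_PERMISSIONS.get(role, set())


can_view = is_allowed("anonymous", "reports_suppressed")
can_view_raw = is_allowed("anonymous", "reports_non_suppressed")
```

Denying by default for unknown roles keeps the check safe if a new role is added before its permissions are defined.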
Questions?