Data standardization within a research information system framework - Steve Revucky
![Page 1: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/1.jpg)
CASRAI ReConnect 2015
INTELLECTUAL PROPERTY AND SCIENCE
STEVE REVUCKY, PRE-SALES SOLUTIONS SPECIALIST
May 3, 2023
![Page 2: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/2.jpg)
FOLLOWING AN EXAMPLE
• Should be easy, right?
![Page 3: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/3.jpg)
STANDARDIZATION IS THIS EASY
![Page 4: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/4.jpg)
YOU SAY POTATO…
– Classifications and standardizations exist so that we can all understand each other and streamline communication
– Taxonomies and lexica designed by organizations such as CASRAI, among others, allow researchers, funders, partners, government authorities, and others to understand each other
– But what about balancing flexibility with standardization?
![Page 5: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/5.jpg)
CONVERIS integrates with internal and external systems
-VIVO, etc.
![Page 6: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/6.jpg)
Engagement in Industry Standards
CONVERIS
![Page 7: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/7.jpg)
CONFIGURATION
• Workflow processes
• Labels
• Entities (create or adapt)
• Roles and rights
![Page 8: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/8.jpg)
![Page 9: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/9.jpg)
Integrating Systems with Converis
![Page 10: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/10.jpg)
ETL Concept
• Extract data in a certain format (CSV, XML, JSON, etc.) from a source location
• Transform and apply business logic to data including aggregation, counting, concatenation, scripting, lookups, merging, push files, etc.
• Load data in a certain format (CSV, XML, JSON) to a destination location
Extract Transform Load
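The three ETL steps above can be sketched in a few lines of Python. This is only an illustration of the concept, not the Kettle or Converis tooling: the function names, the sample CSV columns (`id`, `first`, `last`), and the JSON output shape are all assumptions made for the example.

```python
import csv
import io
import json


def extract(csv_text):
    """Extract: parse CSV text from a source into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: apply business logic (here, concatenation and counting)."""
    people = []
    for row in rows:
        people.append({
            "id": row["id"].strip(),
            # Concatenation: build a display name from two source fields.
            "name": f'{row["first"].strip()} {row["last"].strip()}',
        })
    # Counting: attach a record count alongside the records.
    return {"count": len(people), "people": people}


def load(payload):
    """Load: serialize to the destination format (JSON in this sketch)."""
    return json.dumps(payload, indent=2)


# Hypothetical source data standing in for a file at a source location.
source = "id,first,last\n1,Ada,Lovelace\n2,Alan,Turing\n"
result = load(transform(extract(source)))
print(result)
```

In a real integration the extract and load steps would read from and write to files or services; keeping the transform step a pure function makes the business logic easy to test on its own.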
![Page 11: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/11.jpg)
ETL and Converis
• General ETL (all output formats/steps allowed)
• Converis ETL (fixed output step)
Extract Transform Load
Extract Transform Load
Plugin is installed on your Converis server
It needs to be installed on your workstation too
![Page 12: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/12.jpg)
Implementing Integrations
The requirements document covers three points:
File Handling
− Format (*.csv)
− Location (/dir/*)
− Frequency (e.g. nightly)
Data/Field Mapping
− “hrID” = “converisID”
− “surname” = “lastName”
Business Logic
− What records should be added?
− What updates/changes to data can/should be made in Converis?
− Bidirectional integration?
Sample Banner Req. Doc
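As a rough illustration of how the field mapping and one business rule from such a requirements document might be applied, here is a minimal Python sketch. Only the two mappings ("hrID" to "converisID", "surname" to "lastName") come from the slide; the `office` column, the sample rows, and the "add only new IDs" rule are hypothetical, and this is not how the Converis ETL plugin itself is configured.

```python
import csv
import io

# Mapping from the slide: source (HR) field -> Converis field.
FIELD_MAP = {"hrID": "converisID", "surname": "lastName"}


def map_record(record, field_map=FIELD_MAP):
    """Rename mapped fields; fields without a mapping are dropped."""
    return {
        target: record[source]
        for source, target in field_map.items()
        if source in record
    }


def select_new(records, existing_ids):
    """Hypothetical business rule: add only records whose ID is not yet known."""
    return [r for r in records if r["converisID"] not in existing_ids]


# Hypothetical nightly CSV export; "office" has no mapping and is ignored.
hr_export = "hrID,surname,office\n100,Nguyen,B12\n101,Okafor,C3\n"
mapped = [map_record(row) for row in csv.DictReader(io.StringIO(hr_export))]
to_add = select_new(mapped, existing_ids={"100"})
print(to_add)
```

Keeping the mapping in a plain dictionary mirrors the requirements document: the table can be reviewed by non-developers and changed without touching the logic.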
![Page 13: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/13.jpg)
System architecture
• Internal data sources: Fin-system, HR-system
• External data sources: Scopus, WoS, PubMed, ORCID, …
• Institutional repositories: DSpace, Fedora, EPrints
• Search engine: Apache Solr
• Research analytics: Pentaho
• Data integration: Kettle ETL, …
• Database: PostgreSQL
• Login server
• CONVERIS (Java EE, GlassFish): Java Server Faces (JSF), mapping engine, business logic (EJB)
• API: REST Web services, OAI-PMH
CONVERIS is a Java EE application that follows the typical Java EE three-tier architecture, with a modular design separating user interface, business logic (i.e. functionality), and data management (i.e. data model).
![Page 14: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/14.jpg)
CUSTOMIZATION (WITH LIMITS)
• XML templates
• Choicegroup modification
• Field formatting
![Page 15: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/15.jpg)
RESEARCH AREA CLASSIFICATIONS
– Keyword classifications:
![Page 16: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/16.jpg)
AUTHOR DISAMBIGUATION
![Page 17: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/17.jpg)
TO THE WIDER WORLD
![Page 18: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/18.jpg)
FUTURE POSSIBILITIES AND PLANS
• Any structured data can be ingested into Converis
• Fields can be mapped to existing or new fields
• Each implementation is customized, so potential exists to follow CASRAI guidelines:
– CRediT – Contributor roles taxonomy
• Canadian Common CV (CCV) coming soon – early 2016
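To make the idea of mapping local fields onto a CASRAI vocabulary concrete, the sketch below normalizes institution-specific contributor labels to CRediT role names. It is an assumption-laden illustration: the local labels and their assignments are hypothetical, only a subset of the 14 CRediT roles is listed, and nothing here reflects how Converis stores or maps roles.

```python
# A subset of CRediT role names (the full taxonomy defines 14 roles).
CREDIT_ROLES = {
    "Conceptualization",
    "Data curation",
    "Formal analysis",
    "Software",
    "Supervision",
}

# Hypothetical local labels -> CRediT roles; a real mapping would be
# agreed per implementation.
LOCAL_TO_CREDIT = {
    "programmer": "Software",
    "statistician": "Formal analysis",
    "data manager": "Data curation",
}


def to_credit(label):
    """Return the CRediT role for a local label, or None if unmapped."""
    role = LOCAL_TO_CREDIT.get(label.strip().lower())
    return role if role in CREDIT_ROLES else None


for local in ["Programmer", " statistician ", "lab technician"]:
    print(local.strip(), "->", to_credit(local))
```

Returning `None` for unmapped labels, rather than guessing, leaves room for a manual review step, which is one way to balance flexibility with standardization.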
![Page 19: Data standardization within a research information system framework - Steve Revucky](https://reader035.fdocuments.in/reader035/viewer/2022070600/58cea6091a28abb26e8b6185/html5/thumbnails/19.jpg)
Thank you
Steve Revucky
Pre-Sales Solutions Specialist
IP & Science
Thomson Reuters
Philadelphia, PA
Tel: +1 215 823 [email protected]