Optimizing the Data Supply Chain for Data Science
Transcript of Optimizing the Data Supply Chain for Data Science
![Page 1: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/1.jpg)
Optimizing the Data Supply Chain for Data Science
61 Broadway Suite 1105 New York, NY 10006
[email protected] http://www.vital.ai
Marc Hadfield, CEO, Vital A.I.
![Page 2: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/2.jpg)
about: vital ai
Software Applications: Artificial Intelligence, Machine Learning, Data Science.
Software Vendor & Consulting Services
![Page 3: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/3.jpg)
agenda
• Data Models
• How A.I., Data Science, & Data Governance relate
• Data Supply Chain (DSC) & the Data Product
• Problem: the “Telephone Game” across the DSC
• Architecture Transition from Data Warehouse to DSC
• Data Models and DSC; a Framework for Solutions
• Examples
• Collaboration & Visualization
note: general methodology, with some specific examples from Vital AI implementations.
![Page 4: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/4.jpg)
takeaways:
• The Data Supply Chain is a supply chain to deliver Data Products
• Data Models can capture the implicit meaning of data (and that is the goal!)
• Data Models can help negotiate the implicit differences across the DSC
• Data Models offer a means to collaborate on data standards (meaning) across the DSC partners
![Page 5: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/5.jpg)
about data models:
Semantic Models
![Page 6: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/6.jpg)
big data:
volume, velocity, variety, veracity
variety: data models
“Product”: different meaning in Manufacturing vs Retail context
Healthcare, same entity: “Patient”, “InsuredPerson”, “BillableEntity”
![Page 7: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/7.jpg)
example:
Class: Person
Property: birthday
• Standardized Unique Global Identifier (URI)
• data type: date
• relationship with property: age
• allowed range of values (can’t be born in the future)
• typical (average/expected) value… (Birthdays in Wikipedia vs Customer Database)
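The property description above can be sketched in code. This is a minimal illustration only, not VitalSigns output: the `DateProperty` class, the `age_on` helper, the example URI, and the 1900 lower bound are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DateProperty:
    """Hypothetical description of a date-valued property in a data model."""
    uri: str                         # globally unique identifier for the property
    derived_property: str            # related property computed from this one
    earliest: Optional[date] = None  # lower bound of the allowed range, if any

    def validate(self, value: date, today: date) -> List[str]:
        """Return a list of problems; an empty list means the value is acceptable."""
        problems = []
        if value > today:
            problems.append("birthday is in the future")
        if self.earliest is not None and value < self.earliest:
            problems.append("birthday is before the allowed range")
        return problems


def age_on(birthday: date, today: date) -> int:
    """Derive the related 'age' property from 'birthday'."""
    had_birthday = (today.month, today.day) >= (birthday.month, birthday.day)
    return today.year - birthday.year - (0 if had_birthday else 1)


BIRTHDAY = DateProperty(
    uri="http://example.org/ontology#hasBirthday",  # placeholder URI, not a real ontology
    derived_property="age",
    earliest=date(1900, 1, 1),  # assumed bound for a customer database
)
```

A customer database and Wikipedia would plausibly use different values for `earliest`, which is one way the same property can carry different expectations in different sources.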
![Page 8: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/8.jpg)
about: vital ai tech
Vital AI Development Kit (VDK)
VitalSigns: Data Modeling & Code Generation
VitalService: Common API for Databases, Machine Learning, Apache Spark, Data Transforms
![Page 9: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/9.jpg)
about: vital ai tech
VitalServiceQuery
ExecutableQuery
Query Generator
Common Query API:
• Relational DB (SQL)
• Graph DB (SPARQL)
• Key/Value Store
• NoSQL DB
• Document DB
• Apache Spark
• Hive (Hadoop)
• Predictive Models (a query for an unknown value)
Goal: Build A.I. applications across a variety of infrastructure with a consistent API & Models.
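The idea of one query rendered for several backends can be illustrated with a small sketch. It is not the VitalService API: `PropertyEquals`, `to_sql`, `to_sparql`, the one-table-per-class layout, and the example namespace are all assumptions chosen to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyEquals:
    """Abstract query: instances of a class whose property equals a value."""
    class_name: str
    property_name: str
    value: str


def to_sql(q: PropertyEquals) -> tuple:
    """Render for a relational store, assuming one table per class."""
    # Identifiers come from the data model, not user input; the value is a bound parameter.
    return (f"SELECT * FROM {q.class_name} WHERE {q.property_name} = ?", (q.value,))


def to_sparql(q: PropertyEquals, ns: str = "http://example.org/ontology#") -> str:
    """Render for a graph store, assuming classes and properties share one namespace."""
    # Escape backslashes and double quotes so the value is a valid SPARQL string literal.
    literal = q.value.replace("\\", "\\\\").replace('"', '\\"')
    return (
        f"SELECT ?s WHERE {{ ?s a <{ns}{q.class_name}> ; "
        f'<{ns}{q.property_name}> "{literal}" . }}'
    )
```

The application code only ever builds `PropertyEquals`; which renderer runs is a deployment decision, which is the point of a common query layer.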
![Page 10: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/10.jpg)
example data:
Message | hasSender | Person: Sender
Message | hasRecipient | Person: Recipient
![Page 11: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/11.jpg)
example “MetaQL” query:
GRAPH {
    value segments: ["mydata"]
    ARC {
        node_constraint { Message.class }
        constraint { "?person1 != ?person2" }
        ARC_AND {
            ARC {
                edge_constraint { Edge_hasSender.class }
                node_constraint { Person.props().emailAddress.equalTo("[email protected]") }
                node_constraint { Person.class }
                node_provides { "person1 = URI" }
            }
            ARC {
                edge_constraint { Edge_hasRecipient.class }
                node_constraint { Person.class }
                node_provides { "person2 = URI" }
            }
        }
    }
}
“Person” may have subtypes, like Student or Employee.
![Page 12: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/12.jpg)
a.i. and data quality
![Page 13: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/13.jpg)
data models & machine learning:
using the meaning of classes and properties, automatically generate predictive models.
predictive model features: birthday, zip code, …
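As a rough sketch of how property meaning could drive feature generation: the semantic type labels (`date_of_birth`, `postal_code`, `label`), the `PERSON_MODEL` mapping, and the `features_for` function below are hypothetical, and the 3-digit zip region is just one plausible encoding.

```python
from datetime import date

# Hypothetical model metadata: property name -> semantic type.
PERSON_MODEL = {"birthday": "date_of_birth", "zipCode": "postal_code", "name": "label"}


def features_for(record: dict, model: dict, today: date) -> dict:
    """Derive model-ready features from a record using each property's semantic type."""
    features = {}
    for prop, semantic_type in model.items():
        value = record.get(prop)
        if value is None:
            continue
        if semantic_type == "date_of_birth":
            # A birth date is more useful to a model as an age in whole years.
            before_birthday = (today.month, today.day) < (value.month, value.day)
            features["age"] = today.year - value.year - (1 if before_birthday else 0)
        elif semantic_type == "postal_code":
            # Postal codes are categories, not numbers; keep a coarse 3-digit region.
            features["zip_region"] = str(value)[:3]
        # Labels such as names identify a record rather than predict anything, so skip them.
    return features
```

Because the decisions live in the model metadata, a new dataset that maps its columns to the same semantic types gets the same features without new code.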
![Page 14: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/14.jpg)
data governance = defining the meaning of data = feature (pre)engineering
critical aspect of data science
![Page 15: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/15.jpg)
Progression of Analytics:
![Page 16: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/16.jpg)
where a.i. happens
Progression of Analytics:
![Page 17: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/17.jpg)
Garbage In = Garbage Out
= Bad A.I.
data governance required for Good A.I.
![Page 18: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/18.jpg)
one more point on data governance…
think outside the box (data warehouse)
![Page 19: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/19.jpg)
data governance: data in motion
inside data warehouse vs. outside data warehouse
![Page 21: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/21.jpg)
supply chain:
product
![Page 22: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/22.jpg)
data supply chain:
data product
![Page 23: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/23.jpg)
data product:
• Retail Recommendations…
• Shipping/Logistics Optimization…
• Compliance, Auditing, Security, Fraud Detection…
![Page 24: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/24.jpg)
why data supply chain?
Partner DW Your DW
“No matter who you are, most of the smartest people work for someone else.” (Bill Joy)
![Page 25: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/25.jpg)
why data supply chain?
Partner DW Your DW
“No matter who you are, most of the data works for someone else.” (Bill Joy, revised)
![Page 26: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/26.jpg)
data supply chain
Partner DW
Your DW
why not ETL?
![Page 28: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/28.jpg)
Extract…
not quite as expected…
![Page 29: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/29.jpg)
Transform…
a bit extreme…
![Page 31: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/31.jpg)
Clean…
a lot of manual effort…
![Page 34: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/34.jpg)
what goes wrong?
telephone game…
![Page 35: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/35.jpg)
You
Partner
Model “A”
Model “B”
Implicit Model
![Page 36: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/36.jpg)
Resolution: Make explicit the implicit. Align Data Models.
Reason: Implicit assumptions in the data. ETL can’t see the forest for the trees (or it’s very difficult with missing assumptions).
![Page 37: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/37.jpg)
Example: Internet of Things
Predictive Analytics
“Nest for Office Buildings” Office Tower with Building Management System (BMS) containing 100,000 monitored points (temperature, energy usage of chiller, fan speed, etc.) with significant missing data, errors, and noise. Reconciliation of data to produce predictive models to minimize energy usage. Rules for data correctness.
![Page 38: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/38.jpg)
Sensor Data Validation:
Source data had temperature values of “0” (zero), which meant either that the temperature was 0 degrees or that the sensor had an error. The Data Model “knows” that it’s rarely 0 degrees in July (many standard deviations from the expected value), and that a 0 reading on a day in December can be compared to weather data for reasonableness. If the Data Model also knows the maintenance schedule for the sensors, then it “knows” when to expect “0” error values and can exclude them.
Missing Maintenance Assumptions. Fill in secondary (weather) data for validation.
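The zero-reading rule described above might look like the following sketch. Everything here is an assumption made for illustration: the `Reading` type, the `classify_zero` function, the choice of an outdoor air sensor measured in degrees Celsius, and the 10-degree tolerance against weather data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Reading:
    day: date
    temperature: float  # assumed: outdoor air temperature in degrees Celsius


def classify_zero(reading, outdoor_temp, maintenance_days, tolerance=10.0):
    """Classify a reading as 'valid', 'maintenance_error', or 'suspect'.

    outdoor_temp maps a day to the temperature reported by a weather source;
    maintenance_days is the set of days on which the sensor was being serviced.
    """
    if reading.temperature != 0:
        return "valid"  # only zero readings are ambiguous in this sketch
    if reading.day in maintenance_days:
        return "maintenance_error"  # a zero is the expected error value during servicing
    expected = outdoor_temp.get(reading.day)
    if expected is not None and abs(expected - reading.temperature) <= tolerance:
        return "valid"  # weather data says 0 degrees is plausible on that day
    return "suspect"
```

A zero in July with 29 degrees reported by the weather source comes back as `suspect`, while the same zero on a cold December day passes.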
![Page 39: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/39.jpg)
how did we get here?
Architecture Review: a quick step back…
What is a Data Supply Chain architecture?
![Page 40: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/40.jpg)
“traditional” data warehouse:
ETL within the organization.
Data Governance across the organization.
DW
![Page 41: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/41.jpg)
tech co. “agile” data warehouse:
storage
compute
HDFS
Spark
DataSets
Jobs
Batch/Streaming
Build Predictive Models
Realtime: Spark/Storm
hadoop cluster
![Page 42: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/42.jpg)
enterprise: data lake
storage
compute
HDFS
Spark
X (save $)
“Data Swamp”
![Page 43: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/43.jpg)
aside: Data Lake
better analogy: Scriptorium
library, manuscript copying, & book distribution.
but not as pithy as “Lake”…
![Page 44: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/44.jpg)
tech co. microservices (micro-SOA):
storage
compute
service
“Composed” App
external: social data, weather API
independent clusters, local data expertise
optimize development processes, scale up.
![Page 45: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/45.jpg)
microservices example:
Amazon: product search uses 170 independent microservices
including services for predicting customer characteristics, getting product images, etc.
http://www.infoworld.com/article/2903144/application-development/how-to-succeed-with-microservices-architecture.html
Netflix: similar architecture
![Page 46: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/46.jpg)
Data Supply Chain:
storage
compute
service
Data Product
“ETL”
Owner “A” Owner “B”
optimize development processes, scale up.
independent clusters, local data, ownership
![Page 47: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/47.jpg)
Interaction Points:
Data Product
service
compute ETL
Owner “A” Owner “B”
![Page 48: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/48.jpg)
Data Lineage: Cloudera Navigator
…within a Data Warehouse
trace back jobs that produced every data field.
![Page 49: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/49.jpg)
Data Supply Chain with Provenance:
include provenance data directly in imported dataset. use in rules to interpret the data.
entity-123 | hasSource | datasource-A
entity-123 | name | “John Doe”
Data Warehouse B
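A small sketch of using provenance inside a rule: the triples mirror the rows above, while `entity-456`, `datasource-B`, and its “LAST, FIRST” naming convention are invented purely to show a source-dependent interpretation.

```python
# Triples as (subject, predicate, object); provenance travels with the data.
TRIPLES = [
    ("entity-123", "hasSource", "datasource-A"),
    ("entity-123", "name", "John Doe"),
    ("entity-456", "hasSource", "datasource-B"),  # hypothetical second source
    ("entity-456", "name", "DOE, JANE"),
]


def source_of(entity, triples):
    """Return the data source recorded for an entity, or None if it has none."""
    for s, p, o in triples:
        if s == entity and p == "hasSource":
            return o
    return None


def display_name(entity, triples):
    """Interpret 'name' according to the conventions of the entity's source."""
    name = next((o for s, p, o in triples if s == entity and p == "name"), None)
    if name is None:
        return None
    if source_of(entity, triples) == "datasource-B" and "," in name:
        # Assumed convention for this hypothetical source: "LAST, FIRST" in upper case.
        last, first = [part.strip().title() for part in name.split(",", 1)]
        return f"{first} {last}"
    return name
```

Without the `hasSource` triple the rule would have to guess at the format from the value alone, which is exactly the kind of implicit assumption the deck warns about.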
![Page 50: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/50.jpg)
Interaction Points: Data Models
Data Product
service
compute ETL
Data Models: Gatekeepers & Transform
Owner “A” Owner “B”
![Page 51: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/51.jpg)
Data Supply Chain using Models:
storage
compute
service
Data Product
ETL
Owner “A” Owner “B”
Model Server
Data Models: focus of data governance
![Page 52: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/52.jpg)
Semantic Data Models:
Make explicit the meaning of data
Transformation and Validation Rules leverage the Model and Meaning. Such Rules may be packaged with the Model, and managed together.
Protect against implicit assumptions
![Page 53: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/53.jpg)
Example: Financial Services
A B C
Service Provider
Reconciliation of Corporate Structure across 1,000’s of organizations.
Compliance Rules barring communication between “researchers” and “traders”.
Rules to infer if “Mary” is a “researcher” or “trader”.
Conflicting concepts of “Branch-Office”, “Direct-Report”, etc. across the Globe.
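To make the inference and compliance rules concrete, here is a minimal sketch. The department labels, the `ROLE_BY_DEPARTMENT` mapping, and the `strict` switch are hypothetical; a real deployment would derive roles from much richer corporate-structure data.

```python
# Hypothetical mapping from each organization's department labels to shared roles.
ROLE_BY_DEPARTMENT = {
    "equity research": "researcher",
    "research": "researcher",
    "trading desk": "trader",
    "sales & trading": "trader",
}


def infer_role(person):
    """Infer a shared role from a person's department label, or None if unknown."""
    return ROLE_BY_DEPARTMENT.get(person.get("department", "").strip().lower())


def communication_allowed(sender, recipient, strict=False):
    """Bar researcher/trader communication.

    With strict=True, a pair is also barred when either role cannot be inferred.
    """
    roles = {infer_role(sender), infer_role(recipient)}
    if strict and None in roles:
        return False
    return roles != {"researcher", "trader"}
```

For example, `communication_allowed({"department": "Equity Research"}, {"department": "Trading Desk"})` is barred, while two researchers may communicate freely.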
![Page 54: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/54.jpg)
Example: Hospital Group
A B C
Data Analytics
Reconciliation across Patient Records, Insurance, & Billing for Patient Predictive Analytics.
Rules for identity: “same person”
![Page 55: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/55.jpg)
Data Models:
OWL: Semantic Ontology Model (W3C Standard, Various Standards for Rules)
VitalSigns: Generate Code (validation, transformation, …)
VitalSigns: Versioning, Dependencies, Exchange, Storage, Change Management (Semantic “Diff”)
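The notion of a semantic “diff” between model versions can be illustrated with a toy comparison. This is not how VitalSigns computes it; the `model_diff` function and the `{class: {property: datatype}}` representation are assumptions for the sketch.

```python
def model_diff(old, new):
    """Compare two model versions given as {class_name: {property_name: datatype}}."""
    changes = []
    for cls in sorted(old.keys() | new.keys()):
        if cls not in new:
            changes.append(("class_removed", cls))
        elif cls not in old:
            changes.append(("class_added", cls))
        else:
            for prop in sorted(old[cls].keys() | new[cls].keys()):
                before, after = old[cls].get(prop), new[cls].get(prop)
                if before is None:
                    changes.append(("property_added", f"{cls}.{prop}"))
                elif after is None:
                    changes.append(("property_removed", f"{cls}.{prop}"))
                elif before != after:
                    changes.append(("datatype_changed", f"{cls}.{prop}", before, after))
    return changes
```

Unlike a textual diff, the result names the concepts that changed, so partners can discuss “Person.birthday became a date” rather than line numbers in a file.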
![Page 56: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/56.jpg)
Example: Personally Identifiable Information
Data Governance determines that “Profession” and “ZipCode” cannot be used together. (Maybe a single “Dentist” in a small town…)
Within a single Data Warehouse we can bar these data elements from being combined. But:
Microservice A provides value of “Profession”
Microservice B provides value of “ZipCode”
How to enforce that these two microservices cannot be combined?
![Page 57: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/57.jpg)
Example: Personally Identifiable Information
Validation code enforcing data usage:
Person person123 = get_person_details("entity-123")
// this call works:
person123.profession = get_profession(person123)
// this call blocks because of data model validation:
// person123 already has the “profession” property
person123.zipcode = get_zipcode(person123)
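One way the blocking behaviour could be enforced at the object level is sketched below. It is an illustration under assumptions, not the generated validation code: `GovernedRecord`, `ModelValidationError`, and the `EXCLUSIVE_GROUPS` rule format are hypothetical names.

```python
class ModelValidationError(Exception):
    """Raised when an assignment would violate a data model rule."""


class GovernedRecord:
    """Record whose property assignments are checked against exclusion rules."""

    # Hypothetical rule set: properties in the same group may not coexist on one record.
    EXCLUSIVE_GROUPS = [frozenset({"profession", "zipcode"})]

    def __init__(self, uri):
        # Use object.__setattr__ so internal state bypasses the rule checks below.
        object.__setattr__(self, "uri", uri)
        object.__setattr__(self, "_props", {})

    def __setattr__(self, name, value):
        for group in self.EXCLUSIVE_GROUPS:
            conflicts = (group - {name}) & set(self._props)
            if name in group and conflicts:
                raise ModelValidationError(
                    f"{name!r} cannot be combined with {sorted(conflicts)}"
                )
        self._props[name] = value

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for governed properties.
        try:
            return self._props[name]
        except KeyError:
            raise AttributeError(name) from None
```

Because the rule travels with the record type rather than with either microservice, it holds no matter which service supplied the first value.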
![Page 58: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/58.jpg)
Semantic Data Models:
• Gatekeepers
• Externally Managed.
• Active not Passive, more like “code”.
• Defining what should exist, not cataloguing what exists.
• Can decide when to be tolerant or strict.
![Page 59: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/59.jpg)
Collaborative Conversations:
Infrastructure DevOps
Data Scientists
Business + Domain Experts
Developers
Semantic Model
![Page 60: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/60.jpg)
Collaborative Conversations:
Partner A: Business + Domain Experts, Semantic Model
Partner B: Business + Domain Experts, Semantic Model
Model Alignment
What Concepts to combine, not what Tables to combine (that comes later).
![Page 61: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/61.jpg)
Authoring Tool: OWL IDE Protege
![Page 62: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/62.jpg)
Visualization: Semantic Data
![Page 63: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/63.jpg)
Visualization: WebVOWL
![Page 64: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/64.jpg)
in conclusion, takeaways:
• The Data Supply Chain is a supply chain to deliver Data Products
• Data Models can capture the implicit meaning of data (and that is the goal!)
• Data Models can help negotiate the implicit differences across the DSC
• Data Models offer a means to collaborate on data standards (meaning) across the DSC partners
![Page 65: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/65.jpg)
Questions?
Marc Hadfield, CEO, Vital A.I. [email protected]
![Page 66: Optimizing the Data Supply Chain for Data Science](https://reader033.fdocuments.in/reader033/viewer/2022042907/587fc6b11a28ab3b158b61b5/html5/thumbnails/66.jpg)