The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and...

21
The Materials Commons A Novel Information Repository and Collaboration Platform for the Materials Community Staff: Glenn Tarcea, Sravya Tamma, Domain Scientists: Brian Puchala, Emmanuelle Marquis and John Allison Information Scientists: Margaret Hedstrom (SI) and H. Jagadish (CSE) The University of Michigan

Transcript of The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and...

Page 1: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

TheMaterialsCommonsANovelInformationRepositoryandCollaboration

PlatformfortheMaterialsCommunity

Staff:GlennTarcea,Sravya Tamma,DomainScientists:BrianPuchala,EmmanuelleMarquis

andJohnAllisonInformationScientists:MargaretHedstrom (SI)andH.

Jagadish (CSE)TheUniversityofMichigan

Page 2: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

PRISMS DOESoftwareInnovationCenter

LeadingEdgeMetalsScience

PRISMSComputational

Tools

TheMaterialsCommons

Center for PRedictiveIntegrated Structural Materials

Science (PRISMS)

• A5-yeargranttoacceleratethedevelopmentofpredictivematerialsscience

• Involves:- 11faculty- 5staffscientists- 16students&postdocs

• DOEresourcesleveragedbysignificantUMcostshare

Page 3: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

PRISMS Overarching VisionEnable accelerated predictive materials science.

• Collaboration• Experimental &

Simulation Information• Seamless, Continuous

Workflow• Provenance Tracking• Accelerate model

building and validation

PRISMSComputational

Tools

TheMaterialsCommons

LeadingEdgeMetalsScience

Page 4: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

The Materials Commons

TheMaterialsCommonsconsistsof:– 2fulltimeprofessionalstaff– >25PRISMSfacultyandgradstudentsasusers.– A390TBIsilon storageclusterdatarepository– Awebsiteforuploadinganddownloadingdata,addingprovenance,

sharingandsearchingfordata– Anapplicationinstalledonyourcomputerforuploadingand

downloadingdata– ARESTbasedAPItoaccessandextendthecapabilitiesoftherepository

• InteractingwithNIST,ICE,CHiMaDetal,(andyou?) tosharebestpractices,schemas,etc.

Page 5: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

The Materials Commons - Workflow

SCIENTIFICWORKFLOW

STANDARDIZE CURATE COLLABORATE

PROVENANCE SHARE

CLOUDSERVICES

NIST/MDCS

MaterialsDataFacility

EXPERIMENTS COMPUTATIONSEXTERNALSERVICES

ORGANIZE

PUBLISH

DataCura>onDataStorageDataMiningDataProcessing

MaterialsCommons

Page 6: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

The Materials Commons: https://materialscommons.org

Page 7: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

The Materials Commons - Interface

Center for PRedictive Integrated Structural Materials Science

Materials&Commons&–&Online&User&Guide&

OnBLine&User&Guide&(currently&in&dra<&form)&helps&walk&new&users&through&the&process&

The Materials Commons -Interface

Page 8: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

The Materials Commons - Collaboration

Center for PRedictive Integrated Structural Materials Science

Materials&Commons&–&Use&Case&Collabora1on&

Tensile&useBcase&

demonstrates&

(1)  Integra1on&of&

experimental&data&

including&stressB

strain&curves,&

microstructure&

measurements&and&

DIC&data&to&CPFE&

simula1ons&

(2)  Use&of&MC&as&a&

collabora1ve&

plaYorm&within&the&

PRISMS&

experimental&and&

simula1on&group,&&

(3)  Interface&for&

passing&informa1on&

between&CPFE&code&

and&other&codes&

(DD,&DFT)&using&the&

JSON&data&format.&

Teams&work&in&projects&

•  They&share&files&through&the&website&

•  Reviews&provide&a&mechanism&to&get&input&

and&track&responses&

•  Notes&are&an&addi1onal&mechanism&for&

knowledge&sharing&and&capture&

The Materials Commons - Collaboration

Page 9: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

Materials Commons Data Model

• Sample:Representationsofrealorvirtualthings– materials,experimentalorvirtualspecimens,etc.

• SampleAttributes:Measuredattributesofasample– grainsize,weight,modulus, concentration,diffusivity, etc.

• Measurements: Ameasuredvalueofasampleattribute.– maybeoneormanypersampleattribute

• Process: Representationsofactions– APT,SEM,FatigueTest,MonteCarloCalculation,Continuum Plasticity

Calculation,etc.– specificesifaprocesstransformsasampleattribute– templatesspecifywhatprovenance informationmust/should/canbegiven

Page 10: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

Materials Commons Data Model

A

EC DB

Sec$oning

HeatTreatmentt=10hrHeatTreatmentt=1hr

B’ D’ E’

Analysis

TEM HardnessTestHardnessTestHardnessTest

TEM HardnessTestHardnessTestHardnessTest

C’

A

B

B’

C

C’

Composi$on(asreceived)

Hardness(asreceived)

Hardness(aAer1hr)

Ppt.Size(aAer1hr)

HeatTreatmentt=1hr

Hardness(asreceived)

Hardness(aAer1hr)

Ppt.Size(aAer1hr)

DatafileSample

A/ributeProcess

Measurement

(a)

(b)

(c)

Asreceived

Ppt.Size(aAer10hr)

Hardness(aAer10hr)

D

D'

E

E'

Composi$on(asreceived)

Exampleexperiment:

Ppt.size&hardness=f(heattreatment)

• Recievematerial• Sectionintospecimens• Measureppt.sizebyTEM• Measurehardnessbyindentation tests• Analyzetheresults,fitthemtoa

model,andcreateplots.

Page 11: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

Materials Commons Data Model

A

EC DB

Sec$oning

HeatTreatmentt=10hrHeatTreatmentt=1hr

B’ D’ E’

Analysis

TEM HardnessTestHardnessTestHardnessTest

TEM HardnessTestHardnessTestHardnessTest

C’

A

B

B’

C

C’

Composi$on(asreceived)

Hardness(asreceived)

Hardness(aAer1hr)

Ppt.Size(aAer1hr)

HeatTreatmentt=1hr

Hardness(asreceived)

Hardness(aAer1hr)

Ppt.Size(aAer1hr)

DatafileSample

A/ributeProcess

Measurement

(a)

(b)

(c)

Asreceived

Ppt.Size(aAer10hr)

Hardness(aAer10hr)

D

D'

E

E'

Composi$on(asreceived)

Scientificknowledge isencoded intheprocessdata:

• Sectioningcreatesnewsamples,butthecreatedsampleshavethesamecomposition andhardness.

• Measurementsononesample(A,B,C,D,orE)canbeapplied toallsamples.

Sectioning

Hardness(asreceived)

Composition(asreceived)

Page 12: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

Materials Commons Data Model

A

EC DB

Sec$oning

HeatTreatmentt=10hrHeatTreatmentt=1hr

B’ D’ E’

Analysis

TEM HardnessTestHardnessTestHardnessTest

TEM HardnessTestHardnessTestHardnessTest

C’

A

B

B’

C

C’

Composi$on(asreceived)

Hardness(asreceived)

Hardness(aAer1hr)

Ppt.Size(aAer1hr)

HeatTreatmentt=1hr

Hardness(asreceived)

Hardness(aAer1hr)

Ppt.Size(aAer1hr)

DatafileSample

A/ributeProcess

Measurement

(a)

(b)

(c)

Asreceived

Ppt.Size(aAer10hr)

Hardness(aAer10hr)

D

D'

E

E'

Composi$on(asreceived)

Scientificknowledge isencoded intheprocessdata:

• Heattreatment(ex.B->B')doesnottransformthecomposition, butitdoestransformthehardnessandppt.size.

A

EC DB

Sec$oning

HeatTreatmentt=10hrHeatTreatmentt=1hr

B’ D’ E’

Analysis

TEM HardnessTestHardnessTestHardnessTest

TEM HardnessTestHardnessTestHardnessTest

C’

A

B

B’

C

C’

Composi$on(asreceived)

Hardness(asreceived)

Hardness(aAer1hr)

Ppt.Size(aAer1hr)

HeatTreatmentt=1hr

Hardness(asreceived)

Hardness(aAer1hr)

Ppt.Size(aAer1hr)

DatafileSample

A/ributeProcess

Measurement

(a)

(b)

(c)

Asreceived

Ppt.Size(aAer10hr)

Hardness(aAer10hr)

D

D'

E

E'

Composi$on(asreceived)

Page 13: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

Materials Commons Data Model

Attributesmaybesharedacrosssamples,someasurementsareimmediatelyapplied toallrelevantsamples.

• Composition:• sharedbyallsamples

• Hardness(asreceived)• sharedby{A,B,C,D,E}

• HardnessandPpt.size(after1hr):• sharedby{B',C'}

• HardnessandPpt.size(after10hr):• sharedby{D',E'}

A

EC DB

Sec$oning

HeatTreatmentt=10hrHeatTreatmentt=1hr

B’ D’ E’

Analysis

TEM HardnessTestHardnessTestHardnessTest

TEM HardnessTestHardnessTestHardnessTest

C’

A

B

B’

C

C’

Composi$on(asreceived)

Hardness(asreceived)

Hardness(aAer1hr)

Ppt.Size(aAer1hr)

HeatTreatmentt=1hr

Hardness(asreceived)

Hardness(aAer1hr)

Ppt.Size(aAer1hr)

DatafileSample

A/ributeProcess

Measurement

(a)

(b)

(c)

Asreceived

Ppt.Size(aAer10hr)

Hardness(aAer10hr)

D

D'

E

E'

Composi$on(asreceived)

Page 14: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

Provenance for Multi-scale ModelingInitialatomicconfiguration

DFTCalculation:• Exchange-correlationfunctional• Pseudopotential• Energycutoff• k-pointmesh• etc.

Totalenergy

RelaxedatomicconfigurationStressfreestrain

GrandcanonicalMonteCarlo:• Supercellsize• Convergencecriteria• etc.

FreeenergyPrimitivecrystalstructure

Phasefieldsimulation:• Temperature&time• Solvertype&parameters• Timestep• etc.

Initialmicrostructure

Finalmicrostructure

Ppt.anisotropy

Ppt.size

Clusterexpansion fitting:• Basisset• Crossvalidationmethod• Regressionmethod• Featureselectionmethod• etc.

Trainingconfigurations

CVscore

Clusterexpansion

Primitivecrystalstructure

Page 15: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

Share provenance

• Onceaprojectordatasetissharedwithyou,youmayincludeit'ssample,processes,etc.inanotherproject.

• Datamaybere-usedwhilemaintaingprovenance

DFT

DislocationDynamics

StatisticalMechanics

PhaseField

CrystalPlasticity

Page 16: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

Use Data

• Flexiblyvisualizeandanalyze• Fitconstitutivemodels• Fitprocessmodels• Provenancegraph->workflow->optimization:

– Multi-scaleintegration– Materialdesign– Modelreduction– Optimaldesignofexperimentsandcomputations

Page 17: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

MaterialsCommons– EntryPageThe Materials Commons

Upcoming Features

• Makepublicdatasets:• Persistentidentifiers• Searchandbrowsefordatasets• Give/getfeedback,usagestats• Download formatteddata• Re-usedatainnewprojects,maintaining

provenance

• New"Experiments"inteface:• Organize'task'or'todo'list• Documentasyougo• Provenancegraphisbuilt foryou• Readytomakepublicwhen

finished• PythonAPI

• Writescriptstoupload&downloaddataandprovenance

Page 18: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

Conclusions

• TheMaterialsCommonsisanovelinformationrepositoryandcollaborationplatformforthematerialscommunity– FocusonPRISMStechnicalemphasisareas

• Aimtobeaseamlesspartofthescientificworkflow– Structureddatastorage,withprovenance– Collaborative

• Enablesdatauseforscienceandengineering

Page 19: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access
Page 20: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

Materials Commons Technologies

• API/RESTServicesareallJSONbased• DataSyncismovingtoMsgPak,Curve,andZeroMQ• WebsiteusesSocket.IO forrealtimeupdates,REST/JSONforserviceaccess– Juststartedintegratingthisin

• JSONDocumentstoreonbackend(RethinkDB)• WebsitewrittenusingAngularJS• BackendisamixofPythonandGo• SomeErlang mixedin• RabbitMQ usedforpipelineprocessing

Page 21: The Materials Commons - Northwestern University...• Data Sync is moving to MsgPak, Curve, and ZeroMQ • Website uses Socket.IO for real time updates, REST/JSON for service access

ArchitectureBlockDiagram

NGINXSSL

ReverseProxy

Website

WebServiceUser

DataUpload

APIServices

RabbitMQ

DataServices

elasticsearchCluster

DatabaseCluster

DataLoaderServices

Scripts

SharedServiceAccess

HTTPS HTTP

JSON

GOB(MsgPak,Curve)