Post on 13-Jan-2022
Machine Learning for Manufacturing and Materials
Prof. Randy Paffenroth
Associate Professor of Mathematical Sciences, Computer Science and Data Science,
Worcester Polytechnic Institute
Predictive Maintenance November 23, 2020
Students and collaborators!
Chong Zhou, Wenjing Li, Nitish Bahadur, Kelum Gajamannage, Rasika Karkare, Matt Weiss
Louis Scharf
Anura Jayasumana
Les Servi, Partha Pal
BBN/Raytheon
Josh Uzarski, Yingnan Liu
Patricia Medina
Robert Casoni
Lane Harrison, Alex Wyglinski
http://www.azquotes.com/quote/850928
We are a machine learning research
group that focuses on problems in the
physical sciences
A selection of current applications
Chemical Sensors: Supported by The U.S. Army CCDC-SC
Nano-materials: Supported by Nanocomp Technologies
Cyber Warfare: Supported by BBN/Raytheon and MITRE Corp
Manufacturing: Supported by The Advanced Casting Research Center
Root cause analysis of foundry defect formations drives appropriate corrective action for overall product quality enhancement
Types of Defects
• Porosity: 35%
• Other defects: 32%
• Supplier quality: 22%
• Tool costs/life: 11%
[Figure: Depiction of Severe Internal Porosity]
Source: NADCA & Ultraseal International.
There are many terms flying around these days.
https://sastat.org.za/sasa2017/big-data-dictionary
Data vs. ML approaches quadrant
Axes: Good Physical Model vs. Bad Physical Model; Good Data vs. Bad Data
Good Data
• Balanced data
• Large size
• Unbiased
• Complete
• Noise-free
• Relevant Features
Bad Data
• Small size
• Unbalanced
• Biased
• Missing
• Irrelevant Features
• Anomalous
Full Physics PDE model
Source: Aref et al., Clinical applications of machine learning in cardiovascular disease and its relevance to cardiac imaging
Source: Barata, Using 3D visualizations to tune hyperparameters in ML models
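Several of the "Bad Data" symptoms listed on this slide (small size, unbalanced, missing) can be checked mechanically before any model is trained. The following is a minimal sketch, not something taken from the talk: the function name `data_quality_report`, the thresholds, and the toy casting records are all hypothetical and chosen only for illustration.

```python
from collections import Counter
import math

def data_quality_report(rows, labels, min_rows=1000, max_imbalance=5.0):
    """Flag a few 'bad data' symptoms: small size, class imbalance, missing values.

    rows   : list of feature lists; a missing entry is None or NaN
    labels : list of class labels, one per row
    The thresholds are illustrative assumptions, not recommendations.
    """
    def is_missing(v):
        return v is None or (isinstance(v, float) and math.isnan(v))

    n_rows = len(rows)
    n_cells = sum(len(r) for r in rows)
    n_missing = sum(is_missing(v) for r in rows for v in r)
    counts = Counter(labels)
    # Ratio of the most common class to the rarest class
    imbalance = max(counts.values()) / min(counts.values())
    return {
        "n_rows": n_rows,
        "small_size": n_rows < min_rows,
        "missing_fraction": n_missing / n_cells if n_cells else 0.0,
        "imbalance_ratio": imbalance,
        "unbalanced": imbalance > max_imbalance,
    }

# Hypothetical records: two made-up process features, label = "ok" or "defect"
rows = [[700.0, 1.2], [710.0, None], [695.0, 1.1], [float("nan"), 1.3]]
labels = ["ok", "ok", "ok", "defect"]
report = data_quality_report(rows, labels, min_rows=100, max_imbalance=2.0)
print(report)
```

Bias, irrelevant features, and anomalies are harder to detect automatically and are deliberately left out of this sketch.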
Deep learning vs Machine Learning
Results – Comparison with RF and XGB
Criteria for good DL approaches
Source: Jason Brownlee, How to use Learning Curves to Diagnose Machine Learning Model Performance
Bias-Variance Tradeoff
Model choice based on the size of the dataset
Underfit Robust Overfit
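The underfit / robust / overfit distinction can be illustrated by fitting models of increasing capacity to a small dataset and comparing training error with held-out error. The sketch below is an assumption-laden toy example, not the experiment behind the slide: the synthetic sine-plus-noise process, the sample sizes, and the polynomial degrees are all made up for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def make_data(n):
    # Hypothetical 1-D process: a smooth signal plus measurement noise
    x = rng.uniform(-1.0, 1.0, size=n)
    y = np.sin(np.pi * x) + rng.normal(scale=0.2, size=n)
    return x, y

x_train, y_train = make_data(15)    # deliberately small training set
x_val, y_val = make_data(200)       # held-out data from the same process

def mse(model, x, y):
    return float(np.mean((model(x) - y) ** 2))

results = {}
for degree in (1, 3, 9):            # low, moderate, and high model capacity
    model = Polynomial.fit(x_train, y_train, degree)
    results[degree] = (mse(model, x_train, y_train), mse(model, x_val, y_val))

for degree, (train_err, val_err) in results.items():
    print(f"degree {degree}: train MSE {train_err:.3f}, validation MSE {val_err:.3f}")
```

Training error can only go down as the degree rises, so it is the gap between training and validation error that signals overfitting. With a dataset this small, the highest-capacity model would typically be the one to watch.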
Challenges in data collection
• Semi-Supervised
• Unbalanced and small-size
• Heterogeneous
• Siloed
• Multi-modal data
Key idea: Need to work together!
Associate Professor of Mathematical Sciences, Computer Science and Data Science
Professor Diran Apelian
Metal Processing Institute
Director, Advanced Casting Research Center (ACRC)
Going “into the weeds”…
Dealing with missing and noisy data in manufacturing processes
[Figure: data matrix with records r1 to r9 as rows and features f1 to f13 as columns]
Noise as an item
Noise as a feature
Noise as a record
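The three corruption patterns named here differ only in which cells of the records-by-features matrix are affected: scattered cells, a whole column, or a whole row. A minimal sketch of that distinction follows. The helper name `corrupt`, the use of NaN as the corruption marker, and the 10% rate are assumptions made for illustration; only the 9 x 13 shape is taken from the figure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_records, n_features = 9, 13          # matches r1..r9 and f1..f13 in the figure
X = rng.normal(size=(n_records, n_features))

def corrupt(X, kind, rng, fraction=0.1):
    """Return a copy of X with NaNs marking corrupted entries, plus the boolean mask."""
    mask = np.zeros(X.shape, dtype=bool)
    if kind == "item":        # scattered individual cells
        mask = rng.random(X.shape) < fraction
    elif kind == "feature":   # one entire column, e.g. a failed sensor
        mask[:, rng.integers(X.shape[1])] = True
    elif kind == "record":    # one entire row, e.g. a bad production run
        mask[rng.integers(X.shape[0]), :] = True
    else:
        raise ValueError(f"unknown kind: {kind}")
    out = X.copy()
    out[mask] = np.nan
    return out, mask

for kind in ("item", "feature", "record"):
    _, mask = corrupt(X, kind, rng)
    print(kind, "corrupted cells:", int(mask.sum()))
```

Keeping the mask alongside the data is the useful part: a downstream method can then ignore or down-weight exactly the cells that are known to be unreliable.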
There is hope! Robust Hadamard Autoencoders
Such algorithms exist!
[Diagram: Input X (N × m) → weights W (m × k) → Hidden Layer (N × k) → weights Wᵀ (k × m) → Reconstruction (N × m); Outlier Filter S (N × m); costs labeled Cost1 and Cost2]
Karkare et al., Blind Image Denoising and inpainting using Robust Hadamard Autoencoders, in progress
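To make the shapes in the diagram concrete, here is a heavily simplified sketch of the "Hadamard" idea alone: a reconstruction cost multiplied elementwise by an observation mask, so that missing entries do not contribute to training. This is not the method of Karkare et al. It assumes a linear autoencoder with tied weights (W for encoding, W transposed for decoding), omits the outlier filter S and its cost entirely, and uses synthetic low-rank data, a 20% missing rate, and a step size that are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, m, k = 200, 10, 3

# Hypothetical low-rank data: N records, m features, intrinsic dimension k
X_full = rng.normal(size=(N, k)) @ rng.normal(size=(k, m)) / np.sqrt(k)
M = rng.random((N, m)) > 0.2           # True where an entry was observed (~80%)
X = np.where(M, X_full, 0.0)           # missing entries zero-filled at the input

def masked_loss_and_grad(W, X, M):
    """Hadamard-masked squared error for a tied-weight linear autoencoder."""
    hidden = X @ W                     # N x k
    recon = hidden @ W.T               # N x m
    R = M * (recon - X)                # residual on observed entries only
    n_obs = M.sum()
    loss = 0.5 * np.sum(R ** 2) / n_obs
    # W appears in both encoder and decoder, so the gradient has two terms
    grad = (X.T @ (R @ W) + R.T @ hidden) / n_obs
    return loss, grad

W = 0.1 * rng.normal(size=(m, k))      # W is m x k; the decoder reuses W.T (k x m)
initial_loss, _ = masked_loss_and_grad(W, X, M)
for _ in range(1000):                  # plain gradient descent; step size is an assumption
    loss, grad = masked_loss_and_grad(W, X, M)
    W -= 0.05 * grad
final_loss, _ = masked_loss_and_grad(W, X, M)

print(f"masked loss: {initial_loss:.3f} -> {final_loss:.3f}")
```

The elementwise product `M * (recon - X)` is the only place the mask enters. A nonlinear encoder, or a sparse outlier term like the S in the diagram, would change the model but not that masking step.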
Standard Autoencoder (SAE) with t-SNE: Fully Observed Data Projection
Hadamard Autoencoder (HA) with t-SNE: 20% Missing Data
HA with t-SNE: 40% Missing Data
HA with t-SNE: 60% Missing Data
Conclusions
• Machine learning is a powerful tool for predictive analytics
  • But, like any tool, it must be used properly
• Manufacturing data is different from the types of data that machine learning is typically used on
  • Semi-supervised
  • Unbalanced
  • Heterogeneous
• However, when the correct algorithms are selected, machine learning can be used to solve difficult problems.
Acknowledgements