International Workshop on Machine Learning for Materials...

International Workshop on Machine Learning for Materials Science Abstract Book

Transcript of International Workshop on Machine Learning for Materials...

Page 1: International Workshop on Machine Learning for Materials …asci.aalto.fi/en/midcom-serveattachmentguid-1e70252e0a3acdc025211… · Dr. Albert Bartok-Partay Rutherford Appleton Laboratory,

International Workshop on Machine Learning

for Materials Science

Abstract Book

WORKSHOP SCHEDULE

DAY 1: Wednesday 8 March 2017

09:00 Greetings and Introduction - Dr. Lasse Laurson – Aalto

09:15 Accurate Machine Learning Predictions for Materials Properties - Dr. Matthias Rupp, Theory Department, Fritz-Haber-Institute, Germany

10:00 Analysing and Rationalising Molecular and Materials Databases Using Machine-Learning - Dr. Sandip De, COSMO, EPFL, Switzerland

10:30 COFFEE BREAK

11:00 Learning Interactions from Microscopic Observables - Dr. Albert Bartok-Partay, Rutherford Appleton Laboratory, United Kingdom

11:45 Machine Learning for Structural Diversity in Amorphous Carbon - Dr. Volker Deringer, University of Cambridge, United Kingdom

12:15 LUNCH BREAK

13:30 Machine Learning meets Quantum Chemistry - Prof. Klaus-Robert Müller, Institute of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, Germany

14:15 Efficient Bayesian Inference of Surface Adsorbate Structure - Dr. Milica Todorovic, COMP, Aalto University

14:45 COFFEE BREAK

15:15-18:00 POSTER SESSION

20:00- Workshop Dinner, Helsinki city center

DAY 2: Thursday 9 March 2017

09:00 Machine learning and dissimilarity analysis predict new strategies for bottom-up nanomaterials assembly - Dr. Daniel Packwood, iCeMS, Kyoto University, Japan

09:45 Minimum Energy Path Calculations with Gaussian Processes - Olli-Pekka Koistinen, Dept. of Computer Science, Aalto University

10:15 COFFEE BREAK

10:45 Materials Design on Three Fronts: Fundamental Theory, Automation, and Machine Learning - Dr. Rickard Armiento, Linköping University, Sweden

11:30 Dynamical Simulation of Infrared Spectra with Neural Network Potentials - Michael Gastegger, University of Vienna, Austria

12:00 CLOSING REMARKS

WORKSHOP PRESENTATIONS

Accurate Machine Learning Predictions for Materials Properties

Matthias Rupp

Fritz Haber Institute of the Max Planck Society, Berlin, Germany

email: [email protected]

Systematic computational study and design of materials requires accurate treatment on the atomic scale. Numerical approximations to the many-electron problem exist, but their prohibitive computational scaling severely limits their applicability. In high-throughput settings, machine learning can significantly reduce overall computational cost by interpolating between reference calculations. Current studies indicate that this Ansatz could enable computational savings up to several orders of magnitude in some applications. I will give a brief overview of this recent line of research, with emphasis on kernel-based machine learning models; discuss challenges on the road towards materials design, such as the role of derivatives and the choice of representation; and present current work on a many-body tensor representation for materials, with first results on ab initio formation enthalpies of platinum-group / transition-metal binary alloys.
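As a rough illustration of the kernel-based interpolation between reference calculations mentioned in this abstract, the following minimal sketch fits a kernel ridge regression model with a Gaussian kernel to a one-dimensional toy function. The toy data, the kernel width, the regularisation strength and all function names are assumptions made for illustration only; they are not taken from the talk, and a real application would use a materials representation instead of a scalar input.

```python
import math


def gaussian_kernel(a, b, length_scale=1.0):
    """Gaussian (RBF) kernel between two scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    # Work on an augmented copy so the inputs stay untouched.
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def krr_fit(xs, ys, regularization=1e-8, length_scale=1.0):
    """Return regression weights alpha = (K + lambda * I)^-1 y."""
    n = len(xs)
    gram = [
        [
            gaussian_kernel(xs[i], xs[j], length_scale)
            + (regularization if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]
    return solve_linear(gram, ys)


def krr_predict(x_new, xs, alphas, length_scale=1.0):
    """Prediction is a weighted sum of kernels centred on the training points."""
    return sum(a * gaussian_kernel(x_new, x, length_scale) for a, x in zip(alphas, xs))


# Toy "reference calculations": a smooth function sampled at a few points.
train_x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
train_y = [math.sin(x) for x in train_x]
alphas = krr_fit(train_x, train_y)
prediction = krr_predict(1.25, train_x, alphas)
print("interpolated value at 1.25:", prediction)
```

The only expensive step is the linear solve over the training set, which is why such models are attractive when each reference calculation costs far more than a kernel evaluation.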

Analysing and Rationalising Molecular and Materials Databases Using Machine-Learning

Sandip De and Michele Ceriotti

COSMO, Ecole Polytechnique Fédérale de Lausanne, Switzerland

email: [email protected]

Computational materials design promises to greatly accelerate the process of discovering new or more performant materials. Several collaborative efforts are contributing to this goal by building databases of structures, containing between thousands and millions of distinct hypothetical compounds, whose properties are computed by high-throughput electronic structure calculations. The complexity and sheer amount of information has made manual exploration, interpretation and maintenance of these databases a formidable challenge, making it necessary to resort to automatic analysis tools. Here we will demonstrate how, starting from a measure of (dis)similarity between database items built from a combination of local environment descriptors [1,2], it is possible to apply hierarchical clustering algorithms, as well as dimensionality reduction methods such as sketchmap [3], to analyse, classify and interpret trends in molecular and materials databases, as well as to detect inconsistencies and errors [4]. Thanks to the agnostic and flexible nature of the underlying metric, we will show how our framework can be applied transparently to different kinds of systems ranging from organic molecules and oligopeptides to inorganic crystal structures as well as molecular crystals.

[1] A. P. Bartok, et al, Phys. Rev. B 88 054104 (2013)
[2] Sandip De, et al, Phys. Chem. Chem. Phys. 18 13754 (2016)
[3] G. A. Tribello, et al, Proc. Natl. Acad. Sci. U.S.A. 109 5196 (2012)
[4] Sandip De, et al, Mapping and Classifying Molecules from a High-Throughput Structural Database, accepted in Journal of Cheminformatics

Learning Interactions from Microscopic Observables

Albert Bartok-Partay

Rutherford Appleton Laboratory, Science and Technology Facilities Council, United Kingdom

email: [email protected]

Two main ingredients are needed when Gaussian Processes are used to fit atomic interactions: kernel and data. In my talk I will discuss kernels that are designed to compare atomic structures and show examples from molecular and condensed matter systems. These kernels are used to define a set of interatomic potentials or models, and a Bayesian approach determines which is the most likely, based on the data as evidence. Numerically solving the quantum mechanical equations provides an abundance of microscopic observables that can be used as data in machine learning, but it is far from obvious where points should be placed in configuration space and how to extract optimal information from the calculations. I will present protocols we used to build databases of atomic configurations, which provide machine learning potentials with controllable accuracy.

Machine Learning for Structural Diversity in Amorphous Carbon

Volker Deringer and Gábor Csányi

Engineering Laboratory, University of Cambridge, United Kingdom

email: [email protected]

Machine-learning based interatomic potentials are increasingly used for accurate simulations of solid-state materials. This promises to be particularly useful in the amorphous state, where large system sizes, long time scales, and high accuracy are required at the same time. Here I will present a Gaussian Approximation Potential (GAP) model we have recently developed for liquid and amorphous carbon. The system is challenging to describe, owing, for example, to the coexistence of “sp”, “sp2”, and “sp3” carbon atoms; we therefore combine two-, three-, and many-body structural descriptors, and critically examine how accurately a finite-range potential for carbon phases can be fitted at all. The new GAP model is then validated against several DFT-level and experimental benchmarks, including the density-dependent Young’s moduli and “sp3” concentration. Initial applications will be presented, including surface properties of tetrahedral amorphous carbon (ta-C) and a study of lower-density amorphous forms.

Machine Learning Meets Quantum Chemistry

Klaus-Robert Müller

Institute of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, Germany

email: [email protected]

For a few years now, machine learning has broadened the modeling toolbox for quantum chemistry. The talk will first remind the audience of the main ingredients for applying machine learning. Then the importance of good representations for molecules and materials is discussed, and finally novel deep neural network techniques are introduced that are shown to provide excellent results for predictions (a) across chemical compound space and (b) in MD simulations for single molecules. Most interestingly, the heavily nonlinear model can be interpreted and chemical insight can be extracted from it.

References:

Rupp, M., et al., Physical Review Letters, 108, 058301 (2012)
Snyder, J.C., et al., Physical Review Letters, 108, 253002 (2012)
Hansen, K., et al., J. Chem. Theory Comput., 9, 3404 (2013)
Schütt, K.T., et al., Physical Review B, 89, 205118 (2014)
Chmiela, S., et al., arXiv preprint arXiv:1611.04678 (2016)
Schütt, K.T., et al., Nature Communications, 8, 13890 (2017)

Efficient Bayesian Inference of Surface Adsorbate Structure

M. Todorović1, M.U. Gutmann2, D.R. Hermoso3, R. Pérez3, J. Corander4 and P. Rinke1

1 Aalto University, Finland; 2 University of Edinburgh, United Kingdom; 3 Universidad Autónoma de Madrid, Spain; 4 University of Oslo, Norway

email: [email protected]

The adsorption and self-organisation of molecules at inorganic surfaces is central to many industrial processes, from catalysis and coatings to organic electronics and solar cells. Computer simulations can help identify interface morphology and functionality, but sampling many atomic configurations over large length scales is prohibitively costly. We combined Bayesian optimisation [1] with accurate atomistic simulations in our efficient structure search tool BOSS, designed for intelligent probabilistic sampling of atomic configurations. The nearly parameter-free framework relies on Gaussian processes (GPs) to construct a probable potential energy surface (PES), which is then iteratively refined by input of energy data points from selected configurations.

Figure 1: BOSS application to A) adsorption registry of coronene/Cu(110)-O p(2x1), and B) molecular orientation of fullerene/TiO2 (110).

The BOSS framework was employed in up to six dimensions to identify the optimal adsorption structures of large organic molecules on functional oxide substrates. We report a dramatic speed-up in identifying optimal configurations, compared to the traditional chemical intuition technique, without significant loss of accuracy. Thanks to the clever Bayesian sampling scheme (balancing exploitation and exploration steps) and a streamlined "building block" approach to molecular structure, even complex interface problems can be solved for relevant global and local minima structures.
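The iterative refinement of a GP surrogate described in this abstract can be sketched, under strong simplifying assumptions, as a one-dimensional Bayesian optimisation loop. This is not the BOSS code: the toy energy function, the RBF kernel settings, the candidate grid and the lower-confidence-bound acquisition rule (mean minus kappa times standard deviation) are all hypothetical choices made here to show how exploitation and exploration can be balanced.

```python
import math


def rbf(a, b, length_scale=0.5):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2))


def solve(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def gp_posterior(x_new, xs, ys, noise=1e-4):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    n = len(xs)
    gram = [
        [rbf(xs[i], xs[j]) + (noise if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    k_star = [rbf(x_new, x) for x in xs]
    weights = solve(gram, ys)      # (K + noise * I)^-1 y
    v = solve(gram, k_star)        # (K + noise * I)^-1 k*
    mean = sum(k * w for k, w in zip(k_star, weights))
    variance = rbf(x_new, x_new) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, max(variance, 0.0)


def toy_energy(x):
    """Hypothetical stand-in for an expensive atomistic energy evaluation."""
    return (x - 0.7) ** 2 + 0.3 * math.sin(5.0 * x)


def bayesian_minimise(energy, candidates, initial_points, n_iterations=10, kappa=2.0):
    """Repeatedly evaluate the candidate with the lowest confidence bound."""
    xs = list(initial_points)
    ys = [energy(x) for x in xs]
    for _ in range(n_iterations):
        best_x, best_score = None, float("inf")
        for x in candidates:
            if x in xs:
                continue  # already evaluated
            mean, variance = gp_posterior(x, xs, ys)
            score = mean - kappa * math.sqrt(variance)
            if score < best_score:
                best_x, best_score = x, score
        xs.append(best_x)
        ys.append(energy(best_x))
    lowest = ys.index(min(ys))
    return xs[lowest], ys[lowest], len(xs)


grid = [-2.0 + 0.1 * i for i in range(41)]
start = [-2.0, 0.0, 2.0]
best_x, best_energy, n_evaluations = bayesian_minimise(toy_energy, grid, start)
print("lowest energy found:", best_energy, "at x =", best_x, "after", n_evaluations, "evaluations")
```

A larger `kappa` favours unexplored regions with high predictive variance, while a smaller value concentrates sampling near the current best estimate.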

[1] M. U. Gutmann and J. Corander, J. Mach. Learn. Res., 17, 1 (2016).

Machine learning and dissimilarity analysis predict new strategies for bottom-up nanomaterials assembly

Daniel Packwood

Institute for Integrated Cell-Material Sciences (iCeMS), Kyoto University, and Japan Science and Technology Agency (PRESTO)

email: [email protected]

‘Bottom-up nanomaterials assembly’ refers to the formation of nanometer-sized structures via self-assembly of molecule precursors. While bottom-up assembly can produce structures with remarkably fine shape control, the utility of this technique is limited by the difficulty of predicting a priori how the precursor molecules will assemble. In this presentation, I will introduce a new theoretical methodology for studying the self-assembly of organic precursor molecules on metal surfaces. Our method uses an Ising-like model, in which the energy function is constructed by applying machine learning to a database of pairwise intermolecular interactions and their energies. An original Markov chain Monte Carlo method is then used to find the equilibrium states of the model [1, 2]. Finally, I will describe a dissimilarity analysis technique which quantifies how the structures formed by molecular self-assembly depend upon the chemical properties of the precursor molecule. Importantly, this analysis produces a graphical output which can identify rules for obtaining desired structures from the molecular self-assembly process.
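To make the idea of sampling an Ising-like model with Markov chain Monte Carlo concrete, here is a minimal sketch using the textbook Metropolis algorithm on a periodic square lattice. It is not the original sampling method referred to in the abstract, which is not specified here. The constant nearest-neighbour coupling is a placeholder for the machine-learned pairwise interaction energies, and the lattice size, temperatures and function names are assumptions.

```python
import math
import random


def lattice_energy(spins, coupling=1.0):
    """Energy -J * sum over nearest-neighbour pairs, with periodic boundaries."""
    size = len(spins)
    energy = 0.0
    for i in range(size):
        for j in range(size):
            # Count each bond once via the right and lower neighbours.
            right = spins[i][(j + 1) % size]
            below = spins[(i + 1) % size][j]
            energy -= coupling * spins[i][j] * (right + below)
    return energy


def metropolis_sweep(spins, temperature, rng, coupling=1.0):
    """Attempt one single-spin flip per lattice site on average."""
    size = len(spins)
    for _ in range(size * size):
        i, j = rng.randrange(size), rng.randrange(size)
        neighbours = (
            spins[(i + 1) % size][j]
            + spins[(i - 1) % size][j]
            + spins[i][(j + 1) % size]
            + spins[i][(j - 1) % size]
        )
        # Energy change caused by flipping spin (i, j).
        delta = 2.0 * coupling * spins[i][j] * neighbours
        if delta <= 0.0 or rng.random() < math.exp(-delta / temperature):
            spins[i][j] = -spins[i][j]


rng = random.Random(0)

# Start from the ordered ground state and sample at a very low temperature.
cold_lattice = [[1] * 4 for _ in range(4)]
ground_state_energy = lattice_energy(cold_lattice)
for _ in range(20):
    metropolis_sweep(cold_lattice, temperature=0.01, rng=rng)
cold_energy = lattice_energy(cold_lattice)

# At a higher temperature thermal fluctuations disorder the lattice.
hot_lattice = [[1] * 4 for _ in range(4)]
for _ in range(20):
    metropolis_sweep(hot_lattice, temperature=5.0, rng=rng)
hot_energy = lattice_energy(hot_lattice)
print("ground state:", ground_state_energy, "cold:", cold_energy, "hot:", hot_energy)
```

In a learned model, the constant `coupling` would be replaced by a lookup of the predicted interaction energy for each pair of neighbouring molecular states.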

[1] D. Packwood, P. Han, T. Hitosugi. Nat. Commun. In press (DOI: 10.1038/ncomms14463)
[2] D. Packwood, P. Han, T. Hitosugi. Roy. Soc. Open Sci. 3, 2016, 150681
… and more in preparation!

Minimum Energy Path Calculations with Gaussian Processes

Olli-Pekka Koistinen

Aalto University, Espoo, Finland

email: [email protected]

The calculation of minimum energy paths for transitions such as atomic and/or spin rearrangements is an important task in many contexts and can often be used to determine the mechanism and rate of transitions. An important challenge is to reduce the computational effort in such calculations, especially when ab initio or electron density functional calculations are used to evaluate the energy since they can require large computational effort. Gaussian process regression is used here to reduce significantly the number of energy evaluations needed to find minimum energy paths of atomic rearrangements. By using results of previous calculations to construct an approximate energy surface and then converge to the minimum energy path on that surface in each Gaussian process iteration, the number of energy evaluations is reduced significantly as compared with regular nudged elastic band calculations. For a test problem involving rearrangements of a heptamer island on a crystal surface, the number of energy evaluations is reduced to less than a fifth.

Materials Design on Three Fronts: Fundamental Theory, Automation, and Machine Learning

Rickard Armiento

Linköping University, Sweden

email: [email protected]

The design of new materials with specific properties is at the core of our technological progress. I will present our ongoing efforts spanning over three connected topical areas: (i) fundamental theory development for improved exchange-correlation functionals in density-functional theory (DFT); (ii) progress on software for high-level automation of materials property calculations and its application to materials design; and (iii) the adoption of methods from big data and machine learning for prediction, data mining, and visual exploration in ways that greatly expand the reach and scope of traditional methods. In particular, the talk will cover our recent machine-learning model demonstrated to predict DFT-quality formation energies when trained on 10k calculations, which has facilitated the prediction of 90 new stable elpasolite crystals. I will also discuss the continuation of this work to general geometries and chemistries.

Dynamical Simulation of Infrared Spectra with Neural Network Potentials

Michael Gastegger

University of Vienna, Austria

email: [email protected]

The use of machine learning based techniques to efficiently predict molecular properties is a promising new addition to the field of computational chemistry. One of these strategies is the high-dimensional neural network (HDNN) approach [1,2]. This method exhibits inherent fragmentation capabilities through the exploitation of chemical locality [3] and is therefore well suited for the construction of transferable predictive models for molecular systems of varying size. We apply the HDNN scheme to the modeling of molecular potential energy surfaces. Via the use of molecular forces in the fitting procedure and an adaptive sampling scheme, highly accurate HDNN potentials can be constructed from only a small number of electronic structure reference data. The resulting potentials retain the accuracy of the underlying reference methods, but molecular energies and forces can be evaluated at only a fraction of the original computational cost. This makes it possible to carry out accurate molecular dynamics simulations of large molecular systems over long timescales. In order to demonstrate the power of HDNN potentials, we use them to model the infrared spectra of organic molecules and clusters of varying size via molecular dynamics simulations [4]. The resulting spectra show excellent agreement with those obtained via experiment and electronic structure simulations. Since this technique allows for the simulation of molecular systems usually not accessible with standard electronic structure methods and accounts for vibrational anharmonicities and conformational effects, it might prove a valuable tool in the interpretation of infrared spectra of e.g. biologically relevant systems.

[1] J. Behler, M. Parrinello, Phys. Rev. Lett. 98, 146401 (2007)
[2] M. Gastegger, P. Marquetand, J. Chem. Theory Comput. 11, 2187 (2015)
[3] M. Gastegger, et al, J. Chem. Phys. submitted (2016)
[4] M. Gastegger, P. Marquetand, et al, in preparation

POSTER PRESENTATIONS

Machine learning of quantum forces in amorphous systems: challenges and possible solutions

Claudio Zeni

Physics Department, King's College London, United Kingdom

email: [email protected]

The use of a machine learning approach to the prediction of interatomic forces for molecular dynamics (MD) simulations has gained support in recent years. This technique promises very high accuracy while being computationally much faster than quantum chemical (QM) calculations (e.g., based on DFT methods) [1], [2]. In particular, Gaussian Process (GP) regression has been shown to have good prediction capabilities for force fields in homogeneous crystals, both with and without defects such as vacancies [3]. On the other hand, materials with no underlying point symmetries, such as amorphous semiconductors, still remain challenging at present [4], essentially due to the much higher complexity of the phase space associated with these materials, which requires correspondingly larger training sets. One way to overcome this obstacle can be efficient database searching [5], coupled with local GP training [6]. Efficient searches, in turn, require suitable distance metrics with strong correlation between small distances and similarity of the associated forces in system configuration couples. A good distance must also be invariant under absolute rotation of either configuration in each couple. Here we propose an efficient procedure based on a suitable distance metric, and on the division of the system into independent subsets which are used to train local Gaussian Processes, for which tests reveal significantly improved speed and accuracy over the standard local formulation. This opens the way to improving the quality of the predicted forces in view of QM-accurate MD simulations.

[1] A. P. Bartók, et al, Phys. Rev. Lett. 104, 136403 (2010)
[2] Li, et al, Phys. Rev. Lett. 114, 096405 (2015)
[3] Glielmo A. et al. https://arxiv.org/abs/1611.03877
[4] Volker L. Deringer, Gabor Csanyi https://arxiv.org/abs/1611.03277
[5] M.C. Shaughnessy, R.E. Jones, J. Chem. Theory Comput. 12, 664 (2015)
[6] Zeni C. et al (in preparation)

A Machine Learning heuristic to speed up searches in Chemical Compound Space via the Best-First-Search Algorithm.

J.L. Teunissen, F. De Vleeschouwer, F. De Proft

ALGC, Free University of Brussels, Belgium

email: [email protected]

Searches for optimal molecular compounds in a chemical compound space (CCS) often evaluate a large number of chemical structures via advanced computational methods. Paths that lead to well-performing molecules are then further explored while worse performing molecules are discarded. When a lot of structures are evaluated during such optimization processes, a continuously growing database is constructed. Moreover, when molecular frameworks are tuned site by site, as in the Best First Search algorithm (BFS), the different structures show large similarity. Hence, predictive analytics could be used to improve the efficiency of the molecular design algorithm. Properties of new structures can be predicted beforehand to decide which ones possibly have an optimal property value. Subsequently, only the selected structures are evaluated via the computationally more expensive method.

Two basic models are used to predict the new property values: an Ordinary Least Squares (OLS) model and a Kernel Ridge Regression (KRR) model. The predictive ability of a model depends strongly on the nature of the property to predict. Some properties behave rather linearly, such that every chemical modification (substituent, functional group or dopant) has a particular contribution to the target property. These cases are ideal for statistical linear models where every fragment within the molecular structure can be given a certain coefficient expressing its contribution towards the property. Other properties, however, can show both local and global behavior, depending on the presence or absence of certain functional groups.

Hence, the property cannot be expressed as a mathematical equation with every chemical fragment contributing to a specific extent, and (intrinsically) non-linear machine learning methods are preferred.

As a proof of principle, diamondoid property optimizations are performed with different target properties, such as the HOMO-LUMO gap and the ionization potential. Best First Searches were applied while continuously evaluating to what degree the results can be predicted. (See Figure 1)

Figure 1: Evolution of the Spearman correlation coefficient for 4 different predictive analytics. The x-axis indicates the different prediction points used in the BFS procedure (for minimizing the IP of adamantane), in chronological order, with the number of training samples increasing continuously. Hence, also an increasing predictive ability can be observed.
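The Spearman correlation coefficient tracked in Figure 1 measures how well a model reproduces the ranking of candidate structures, which is what matters when only the most promising ones are passed on to the expensive method. A minimal sketch of its computation is given below, as the Pearson correlation of average ranks; the function names and the example numbers are hypothetical and are not data from the poster.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied entries the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return covariance / math.sqrt(var_a * var_b)


def spearman(predicted, reference):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(predicted), average_ranks(reference))


# Hypothetical predicted and computed property values for five structures.
predicted_values = [1.0, 2.0, 3.0, 4.0, 5.0]
computed_values = [1.0, 3.0, 2.0, 5.0, 4.0]
rho = spearman(predicted_values, computed_values)
print("Spearman coefficient:", rho)
```

A value close to 1 means the cheap model orders the candidates almost exactly as the expensive calculation would, even if the absolute values differ.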

High-Throughput Screening of Extrinsic Point Defect Properties in Si and Ge: Database and Applications

M. Sluydts1, M. Pieters1, J. Vanhellemont2, V. Van Speybroeck1 and S. Cottenier1,3

1 Center for Molecular Modeling; 2 Department of Solid State Sciences; 3 Department of Materials Science and Engineering, Ghent University

email: [email protected]

Due to the increased availability of computational resources, DFT calculations that used to be time-consuming can now be performed in large numbers. Therefore, automated high-throughput screening methods have appeared, capable of generating extensive DFT-based datasets. Datasets of this size are difficult to obtain experimentally. Moreover, the level of control one has over a computed dataset is larger than for an experimental set. For instance, atom positions in defect complexes are imposed in a calculation, while in experiment one must live with the often-unknown geometry of the complex that spontaneously appears. Examining these datasets allows for the discovery of global trends as well as the identification of interesting cases which can serve as a starting point for further research.

In the present work, we applied a high-throughput methodology to study dopant behavior in the prototype semiconductors Si and Ge. DFT calculations were performed for 73 dopants from H to Rn (excluding the lanthanides) at 6 common positions in the Si and Ge lattices, always with full geometry optimization. The lowest-enthalpy positions were identified and compared to experiment, providing a means of validation. The dataset was then used to determine vacancy trapping enthalpies. By formulating criteria for a given application, the dopants that lead to optimal vacancy traps could be selected. Such knowledge is of direct relevance to industrial processes such as Czochralski growth, where suitable vacancy traps can suppress void formation. This work illustrates how high-throughput methods can be used, not only to verify and extend data, but also to direct future research efforts towards relevant applications.

[1] DOI: 10.1021/acs.chemmater.6b03368

Excitons and Optical Spectra of Phosphorene Nanoribbons

Zahara Nourbakhsh

Institute for Research in Fundamental Sciences

email: [email protected]

On the basis of many-body ab initio calculations, using the single-shot G0W0 method and the Bethe-Salpeter equation, we study phosphorene nanoribbons (PNRs) in the two typical zigzag and armchair directions. The electronic structure, optical absorption, electron-hole (exciton) binding energy, exciton exchange splitting, and exciton wave functions are calculated for different sizes of PNRs. The typically strong splitting between singlet and triplet excitonic states makes PNRs favorable systems for optoelectronic applications. Quantum confinement occurs in both kinds of PNRs, and it is stronger in the zPNRs, which behave like quasi-zero-dimensional systems. Scaling laws are investigated for the size-dependent behaviors of PNRs. The first bright excitonic state in PNRs is explored in detail.

Mapping and Classifying Molecules from a High-Throughput Structural Database

Felix Musil

COSMO, Ecole Polytechnique Fédérale de Lausanne, Switzerland

email: [email protected]

High-throughput computational materials design promises to greatly accelerate the process of discovering new materials and compounds, and of optimizing their properties. The large databases of structures and properties that result from computational searches, as well as the agglomeration of data of heterogeneous provenance, lead to considerable challenges when it comes to navigating the database, representing its structure at a glance, understanding structure-property relations, eliminating duplicates and identifying inconsistencies. Here we present a case study, based on a data set of conformers of amino acids and dipeptides, of how machine-learning techniques can help address these issues. We will exploit a recently developed strategy to define a metric between structures, and use it as the basis of both clustering and dimensionality reduction techniques, showing how these can help reveal structure-property relations, identify outliers and inconsistent structures, and rationalise how perturbations (e.g. binding of ions to the molecule) affect the stability of different conformers.

Predicting Properties of Cubic Perovskites Using Machine Learning

Sten Haastrup

Center for Atomic-Scale Materials Design, DTU, Denmark

email: [email protected]

The perovskite class of compounds, with the general formula ABX3, has shown promise for solar cell applications, with efficiencies increasing from about 4% in 2009 to 22% in 2016. This improvement has increased interest in perovskite structures for other applications, including water splitting. Finding suitable materials for these applications means knowing the electronic structure and properties; in particular the band gaps and the positions of the band edges, as well as the stability of each compound under operating conditions. These properties can to a large extent be calculated with density functional theory (DFT), but getting accurate results is challenging and time-consuming. The goal is therefore to use machine learning to predict the properties of the compounds based on the properties of the constituent atoms. Based on a screening study of perovskites, the band gaps, band edges and heats of formation of 19000 perovskites are known [1], and these data along with a suitable compound descriptor are used to train supervised learning models. Using ridge regression, it is possible to accurately predict the heat of formation of the perovskites. Predicting the presence of a band gap is harder, but using neural networks it is possible to get a partial separation into gapped and gapless classes.

[1] cmr.fysik.dtu.dk/cubic_perovskites/cubic_perovskites.html

Machine Learning of Quantum Forces: building accurate force fields for molecular dynamics simulation via “covariant” kernels.

Aldo Glielmo

King’s College London, Physics Department

email: [email protected]

In recent years, the construction of data-driven force fields via Machine Learning methods has proved to be a promising route to bridge the gap between accurate (but slow) quantum chemical calculations and fast (but often unreliable) classical interatomic potentials [1,2,3]. I will present a new scheme [4] that accurately predicts forces as vector quantities, rather than sets of scalar components, by Gaussian Process (GP) regression. This is based on matrix-valued kernel functions, on which we impose that the predicted force rotates with the target configuration and is independent of any rotations applied to the configuration database entries. We show that such "covariant" GP kernels can be obtained by integration over the elements of the rotation group SO(n). Remarkably, in specific cases the integration can be carried out analytically and yields a conservative force field that can be recast into a pair interaction form. The accuracy of our kernels in predicting quantum-mechanical forces in real materials is investigated by tests on pure and defective Ni and Fe crystalline systems. I will further discuss how such a learning algorithm can be used to build a measure of complexity of physical systems [5]. Indeed, this can be defined as the number of canonically sampled configurations needed to achieve low generalization error with high probability.
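The construction of a covariant kernel by integration over the rotation group can be illustrated numerically in two dimensions, where SO(2) is parametrised by a single angle. The sketch below assumes a toy configuration consisting of one neighbour vector and a Gaussian base kernel, and replaces the group integral by a uniform sum over angles; these simplifications, and all function names, are assumptions made for illustration and do not reproduce the analytical kernels of the poster.

```python
import math


def rotate(vector, angle):
    """Rotate a 2D vector counter-clockwise by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * vector[0] - s * vector[1], s * vector[0] + c * vector[1])


def base_kernel(r_a, r_b, sigma=1.0):
    """Scalar Gaussian kernel; unchanged when both arguments are rotated together."""
    d2 = (r_a[0] - r_b[0]) ** 2 + (r_a[1] - r_b[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))


def covariant_kernel(r_a, r_b, n_angles=360):
    """2x2 matrix K = average over angles of R(angle) * k(r_a, R(angle) r_b)."""
    k00 = k01 = k10 = k11 = 0.0
    for step in range(n_angles):
        angle = 2.0 * math.pi * step / n_angles
        weight = base_kernel(r_a, rotate(r_b, angle)) / n_angles
        c, s = math.cos(angle), math.sin(angle)
        # Accumulate weight * [[c, -s], [s, c]].
        k00 += weight * c
        k01 -= weight * s
        k10 += weight * s
        k11 += weight * c
    return [[k00, k01], [k10, k11]]


# Covariance check: rotating the first configuration by 90 degrees should
# rotate the kernel matrix from the left, K(S r_a, r_b) = S K(r_a, r_b).
r_a, r_b = (1.0, 0.2), (0.8, -0.5)
kernel = covariant_kernel(r_a, r_b)
kernel_rotated = covariant_kernel(rotate(r_a, math.pi / 2.0), r_b)
# For S = [[0, -1], [1, 0]], the product S K has rows (-K[1], K[0]).
expected = [[-kernel[1][0], -kernel[1][1]], [kernel[0][0], kernel[0][1]]]
max_deviation = max(
    abs(kernel_rotated[i][j] - expected[i][j]) for i in range(2) for j in range(2)
)
print("largest deviation from covariance:", max_deviation)
```

Because the kernel matrix itself rotates with the target configuration, a force predicted as a kernel-weighted sum of database forces rotates with it as well, without the database having to contain rotated copies.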

[1] Behler, et al, Phys. Rev. Lett. 98, 146401 (2007)
[2] Bartók, et al, Phys. Rev. Lett. 104, 136403 (2010)
[3] Li, et al, Phys. Rev. Lett. 114, 096405 (2015)
[4] Glielmo et al, under review, Phys. Rev. B. arxiv.org/abs/1611.03877
[5] Glielmo et al, in preparation
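
The covariance property described in the abstract above can be checked numerically in the simplest case, n = 2. The sketch below is an illustration under stated assumptions, not the scheme of ref. [4]: the base kernel (a sum of Gaussians over pairs of neighbour vectors), the width sigma, the quadrature over angles in place of an analytic integration, and all function names are hypothetical choices. It builds a 2x2 matrix kernel by averaging R k_b(rho, R rho') over rotations R in SO(2) and verifies that rotating the two configurations by S and S' transforms the kernel as S K S'^T.

```python
import numpy as np

def rotation(theta):
    """2D rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def base_kernel(conf_a, conf_b, sigma=0.5):
    """Scalar base kernel: sum of Gaussians over all pairs of neighbour vectors.

    It is unchanged when both configurations are rotated by the same matrix.
    """
    diff = conf_a[:, None, :] - conf_b[None, :, :]
    sq_dist = np.sum(diff**2, axis=-1)
    return np.sum(np.exp(-sq_dist / (4.0 * sigma**2)))

def covariant_kernel(conf_a, conf_b, n_angles=360, sigma=0.5):
    """Matrix-valued kernel K = average over theta of R(theta) * k_b(a, R(theta) b).

    The average over SO(2) is approximated on a uniform grid of angles.
    """
    thetas = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    K = np.zeros((2, 2))
    for theta in thetas:
        R = rotation(theta)
        # Rows of conf_b are vectors r, so r -> R r is conf_b @ R.T.
        K += R * base_kernel(conf_a, conf_b @ R.T, sigma)
    return K / n_angles

# Two random local environments (rows are neighbour position vectors).
rng = np.random.default_rng(1)
conf_a = rng.normal(size=(4, 2))
conf_b = rng.normal(size=(5, 2))

# Rotate the two configurations independently and compare with S K S'^T.
S, S_prime = rotation(0.7), rotation(-1.3)
K = covariant_kernel(conf_a, conf_b)
K_rot = covariant_kernel(conf_a @ S.T, conf_b @ S_prime.T)
covariance_error = np.max(np.abs(K_rot - S @ K @ S_prime.T))
print(f"max deviation from covariance: {covariance_error:.2e}")
```

Used as a GP kernel, a matrix with this property makes the predicted force vector rotate with the target configuration while staying independent of how the database entries happen to be oriented.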

A Reduced Localised Coulomb Descriptor for the Gaussian Approximation Potential

James Barker, Johannes Bulin, and Jan Hamaekers

Fraunhofer Institute for Algorithms and Scientific Computing, Germany

email: [email protected]

We introduce a novel class of localised atomic environment representations, based upon the Coulomb matrix. By combining these functions with the Gaussian approximation potential approach, we present LC-GAP, a new system for generating atomic potentials through machine learning. Tests on the QM7, QM7b, and GDB9 biomolecular datasets demonstrate that potentials created with LC-GAP can successfully predict atomisation energies to chemical accuracy for molecules larger than those used for training, and can (in the case of QM7b) also be used to predict a range of other atomic properties with accuracy in line with the recent literature. As the best-performing representation has only linear dimensionality in the number of atoms in a local atomic environment, this represents an improvement both in prediction accuracy and computational cost when considered against similar Coulomb matrix-based methods.
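
For orientation, the standard global Coulomb matrix on which these representations build can be sketched as follows. This is the widely used whole-molecule definition (diagonal 0.5 Z_i^2.4, off-diagonal Z_i Z_j / |R_i - R_j|, atomic units), not the reduced localised descriptor of the abstract, whose exact form is not given here. The function name and the water-like geometry are illustrative assumptions.

```python
import numpy as np

def coulomb_matrix(charges, positions):
    """Standard Coulomb matrix of a molecule.

    charges:   nuclear charges Z_i, shape (n,)
    positions: Cartesian coordinates in bohr, shape (n, 3)
    """
    charges = np.asarray(charges, dtype=float)
    positions = np.asarray(positions, dtype=float)
    dist = np.linalg.norm(positions[:, None, :] - positions[None, :, :], axis=-1)
    np.fill_diagonal(dist, 1.0)  # placeholder, avoids division by zero on the diagonal
    M = np.outer(charges, charges) / dist
    np.fill_diagonal(M, 0.5 * charges**2.4)  # self-interaction term
    return M

# Illustrative water-like geometry (coordinates are assumptions, in bohr).
charges = [8, 1, 1]
positions = [[0.0, 0.0, 0.0], [1.8, 0.0, 0.0], [-0.45, 1.74, 0.0]]
M = coulomb_matrix(charges, positions)
print(M)
```

This matrix has n^2 entries for n atoms, which is the quadratic scaling that a representation of linear dimensionality in the size of the local environment avoids.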

Electronic and Thermal Properties of Monolayer Dichalcogenide MoS2: A First-Principles Approach

Amreen Bano, Pareeti Khare and N. K. Gaur

Department of Physics, Barkatullah University, India

email: [email protected]

The electronic band structure and the effect of temperature on the thermal properties of monolayer MoS2 have been investigated in the present work. The electronic structure calculations were performed using the plane-wave pseudopotential method based on density functional theory. In monolayer MoS2, the band gap of 1.64 eV was found to be direct at the K-point. All temperature-dependent calculations were performed using first-principles calculations based on the Quasi-Harmonic Approximation (QHA). Transport properties of MoS2, a prototypical layered transition-metal dichalcogenide, have been calculated through the projector-augmented wave (PAW) method as implemented in the Quantum Espresso software. At room temperature (300 K), the values obtained are 61.12 J/K/mol for the specific heat Cv, 76.706 kJ/mol for the free energy F and 31.68 J/K/mol for the entropy. In our study we have found that Cv follows a T^3 law at low temperatures and gradually becomes almost linear as the temperature increases. We have also found that the entropy is sensitive to temperature. The thermal response of the free energy is also studied, and it shows a decrease with rising temperature. Confinement of bulk MoS2 in a 2D monolayer is a way to engineer 3D nanoparticles having a direct band gap and high potential transport properties.
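
The thermal quantities quoted above follow, within a harmonic treatment, from the phonon frequencies alone. The sketch below evaluates the textbook per-mode expressions for the free energy, entropy and heat capacity at fixed volume. The nine frequencies are hypothetical placeholders rather than computed MoS2 phonons, and a real calculation would sum over a mesh of wave vectors and, in the QHA, repeat this at several volumes, so the printed numbers are not expected to reproduce the values in the abstract.

```python
import numpy as np

K_B = 1.380649e-23      # Boltzmann constant, J/K
HBAR = 1.054571817e-34  # reduced Planck constant, J s
N_A = 6.02214076e23     # Avogadro constant, 1/mol

def harmonic_thermo(omegas, temperature):
    """Harmonic free energy (J/mol), entropy and Cv (J/K/mol) for a set of modes.

    omegas:      angular frequencies in rad/s (all strictly positive)
    temperature: temperature in K
    """
    omegas = np.asarray(omegas, dtype=float)
    x = HBAR * omegas / (K_B * temperature)
    log_term = np.log1p(-np.exp(-x))  # ln(1 - exp(-x))
    free_energy = np.sum(0.5 * HBAR * omegas + K_B * temperature * log_term)
    entropy = K_B * np.sum(x / np.expm1(x) - log_term)
    heat_capacity = K_B * np.sum(x**2 * np.exp(x) / np.expm1(x) ** 2)
    return N_A * free_energy, N_A * entropy, N_A * heat_capacity

# Hypothetical mode frequencies in THz (placeholders, nine modes for three atoms).
freqs_thz = np.array([3.0, 3.0, 5.0, 8.5, 8.5, 11.5, 11.5, 12.2, 14.1])
omegas = 2.0 * np.pi * freqs_thz * 1e12

free_energy_300, entropy_300, cv_300 = harmonic_thermo(omegas, 300.0)
print(f"F = {free_energy_300 / 1e3:.2f} kJ/mol")
print(f"S = {entropy_300:.2f} J/K/mol")
print(f"Cv = {cv_300:.2f} J/K/mol")
```

In this model the free energy decreases with rising temperature because its temperature derivative is minus the entropy, which is consistent with the trend reported in the abstract.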

An ab initio Database of Magnetic Materials

Torbjörn Björkman

Åbo Akademi, Turku, Finland

email: [email protected]

We present a first account of the buildup phase of a database of magnetic properties calculated from density functional theory. Preliminary results for a few standard ground state properties, such as spin and orbital magnetic moments, are presented. More advanced derived properties to be highlighted include spin damping coefficients and Heisenberg interaction parameters.

Vacancies in Silicon Carbide

Joel Davidsson

Linköping Universitet, Linköping, Sweden

email: [email protected]

Silicon Carbide (SiC) is a large-bandgap semiconductor that is prone to containing point defects. In recent years, it has been suggested that some of these defects could be used as quantum bits. Since many of these defects are unknown, theoretical models are useful for their identification. Ideally, we would like to simulate all possible defects that can occur and calculate their photoluminescence spectra. In preparation for these high-throughput calculations, we present a study of convergence specifically for divacancies in 4H-SiC.