
Discovering Phase Transitions with Unsupervised Learning

Lei Wang
Beijing National Lab for Condensed Matter Physics and Institute of Physics,

Chinese Academy of Sciences, Beijing 100190, China

Unsupervised learning is a discipline of machine learning which aims at discovering patterns in big data sets or classifying the data into several categories without being trained explicitly. We show that unsupervised learning techniques can be readily used to identify phases and phase transitions of many-body systems. Starting with raw spin configurations of a prototypical Ising model, we use principal component analysis to extract relevant low-dimensional representations of the original data and use clustering analysis to identify distinct phases in the feature space. This approach successfully identifies physical concepts such as the order parameter and the structure factor as indicators of the phase transition. We discuss future prospects of discovering more complex phases and phase transitions using unsupervised learning techniques.

Classifying phases of matter and identifying phase transitions between them is one of the central topics of condensed matter physics research. Despite an astronomical number of constituent particles, it often suffices to represent states of a many-body system with only a few variables. For example, a conventional approach in condensed matter physics is to identify order parameters via symmetry considerations or by analyzing low-energy collective degrees of freedom, and to use them to label phases of matter [1].

However, it is harder to identify phases and phase transitions in this way for a growing number of new states of matter, where the order parameter may only be defined in an elusive nonlocal way [2]. These new developments call for new ways of identifying appropriate indicators of phase transitions.

To meet this challenge, we use machine learning techniques to extract information about phases and phase transitions directly from many-body configurations. In fact, the application of machine learning techniques to condensed matter physics is a burgeoning field [3-13] [33]. For example, regression approaches are used to predict crystal structures [3], to approximate density functionals [6], and to solve quantum impurity problems [10]; artificial neural networks are trained to classify phases of classical statistical models [13]. However, most of those applications use supervised learning techniques (regression and classification), where a learner needs to be trained with a previously solved data set (input/output pairs) before it can be used to make predictions.

On the other hand, in unsupervised learning there is no such explicit training phase. The learner must find interesting patterns in the input data on its own. Typical unsupervised learning tasks include cluster analysis and feature extraction. Cluster analysis divides the input data into several groups based on certain measures of similarity. Feature extraction finds a low-dimensional representation of the dataset while still preserving essential characteristics of the original data. Unsupervised learning methods have broad applications in data compression, visualization, online advertising, recommender systems, etc. They are often used as a preprocessing step for supervised learning to simplify the training procedure. In many cases, unsupervised learning also leads to better human interpretation of complex datasets.

In this paper, we explore the application of unsupervised learning in many-body physics with a focus on phase transitions. The advantage of unsupervised learning is that one assumes neither the presence of the phase transition nor the precise location of the critical point. Dimension reduction techniques can extract salient features such as the order parameter and the structure factor from the raw configuration data. Clustering analysis can then divide the data into several groups in the low-dimensional feature space, each representing a phase. Our studies show that unsupervised learning techniques have great potential for addressing the big-data challenge in many-body physics and for making scientific discoveries.

As an example, we consider the prototypical classical Ising model

H = −J ∑_⟨i,j⟩ σ_i σ_j ,   (1)

where the spins take two values σ_i = {−1, +1}. We consider the model (1) on a square lattice with periodic boundary conditions and set J = 1 as the energy unit. The system undergoes a phase transition at the temperature T/J = 2/ln(1 + √2) ≈ 2.269 [14]. A discrete Z_2 spin inversion symmetry is broken in the ferromagnetic phase below T_c and is restored in the disordered phase at temperatures above T_c.

We generate 100 uncorrelated spin configuration samples using Monte Carlo simulation [15] at each temperature T/J = 1.6, 1.7, . . . , 2.9 and collect them into an M × N matrix

X = ⎛ ↑ ↓ ↑ . . . ↑ ↑ ↑ ⎞
    ⎜         ⋮         ⎟
    ⎝ ↓ ↑ ↓ . . . ↑ ↓ ↑ ⎠_{M×N} ,   (2)

where M = 1400 is the total number of samples and N is the number of lattice sites. The up and down arrows in the matrix denote σ_i = ±1.


Figure 1: The first few explained variance ratios obtained from the raw Ising configurations. The inset shows the weights of the first principal component on an N = 40² square lattice.

Such a matrix is the only data we feed to the unsupervised learning algorithm.
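As an illustration of how such a data matrix can be assembled (this is not the code used in this work, which samples with the Wolff cluster algorithm [15]; the sketch below substitutes a simple single-spin-flip Metropolis sampler, and the lattice size, sweep counts, and variable names are illustrative assumptions):

import numpy as np

rng = np.random.default_rng(0)

def metropolis_sweep(spins, beta):
    """One Metropolis sweep of an L x L Ising model with periodic boundaries (J = 1)."""
    L = spins.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(L, size=2)
        nn = spins[(i + 1) % L, j] + spins[(i - 1) % L, j] \
           + spins[i, (j + 1) % L] + spins[i, (j - 1) % L]
        dE = 2.0 * spins[i, j] * nn
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1
    return spins

L, samples_per_T = 20, 100                    # N = L**2 = 400, the smallest lattice used here
temperatures = np.arange(1.6, 2.91, 0.1)      # T/J = 1.6, 1.7, ..., 2.9
rows = []
for T in temperatures:
    spins = rng.choice([-1, 1], size=(L, L))
    for _ in range(200):                      # equilibration sweeps (illustrative)
        metropolis_sweep(spins, 1.0 / T)
    for _ in range(samples_per_T):
        for _ in range(10):                   # decorrelation sweeps (illustrative)
            metropolis_sweep(spins, 1.0 / T)
        rows.append(spins.flatten())

X = np.array(rows)                            # shape (M, N) = (1400, 400), as in Eq. (2)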

Our goal is to discover a possible phase transition of the model (1) without assuming its existence. This is different from the supervised learning task, where exact knowledge of T_c was used to train a learner [13]. Moreover, the following analysis does not assume any prior knowledge about the lattice geometry or the Hamiltonian. We are going to use the unsupervised learning approach to extract salient features in the data and then use this information to cluster the samples into distinct phases. Knowledge about the temperature of each sample and the critical temperature T_c of the Ising model is used only to verify the clustering.

Interpreting each row of X as a coordinate in an N-dimensional space, the M data points form a cloud centered around the origin of a hypercube [34]. Discovering a phase transition amounts to finding a hypersurface that divides the data points into several groups, each representing a phase. This task is akin to the standard unsupervised learning technique of cluster analysis [16], for which numerous algorithms are available that group the data based on different criteria.

However, directly applying clustering algorithms to the Ising configurations may not be very enlightening. The reasons are twofold. First, even if one manages to separate the data into several groups, clusters in a high-dimensional space may not directly offer useful physical insights. Second, many clustering algorithms rely on a good measure of similarity between the data points, whose definition is ambiguous without supplying domain knowledge such as the distance between two spin configurations.

Figure 2: Projection of the samples onto the plane of the two leading principal components. The color bar on the right indicates the temperature T/J of the samples. Panels (a-c) are for N = 20², 40², and 80² sites, respectively.

On the other hand, the raw spin configuration is a highly redundant description of the system's state because there are correlations among the spins. Moreover, as the temperature varies, there is an overall tendency in the raw spin configurations, such as a lowering of the total magnetization. In the following, we will first identify some crucial features in the raw data. They provide an effective low-dimensional representation of the original data, and in terms of these features the meaning of the distance between configurations becomes more transparent. The separation of phases is also often clearly visible and comprehensible to humans in the reduced space spanned by these features. Therefore, feature extraction not only simplifies the subsequent clustering analysis but also provides an effective means of visualization and offers physical insight. We denote the crucial features extracted by the unsupervised learning as indicators of the phase transition. In general, they do not necessarily need to be the same as the conventional order parameters defined in condensed matter physics. This unsupervised learning approach nevertheless provides an alternative view of phases and phase transitions.

Principal component analysis (PCA) [17] is a widely used feature extraction technique. The principal components are mutually orthogonal directions along which the variances of the data decrease monotonically. PCA finds the principal components through a linear transformation of the original coordinates, Y = XW. When applied to the Ising configurations in Eq. (2), PCA finds the most significant variations of the data as the temperature changes. We interpret them as relevant features in the data and use them as indicators of the phase transition, if there is any.

We write the orthogonal transformation in terms of column vectors W = (w_1, w_2, . . . , w_N) and denote w_ℓ as the weights of the principal components in the configuration space.


Figure 3: Typical configurations of the COP Ising model below (a,b) and above (c) the critical temperature. Red and blue pixels indicate up and down spins. Exactly half of the pixels are red (blue) due to the constraint ∑_i σ_i ≡ 0.

They are determined by the eigenproblem [18] [35]

XᵀX w_ℓ = λ_ℓ w_ℓ .   (3)

The eigenvalues are nonnegative real numbers sorted in descending order, λ_1 ≥ λ_2 ≥ . . . ≥ λ_N ≥ 0. Using the terminology of PCA, we denote the normalized eigenvalues λ̃_ℓ = λ_ℓ / ∑_{ℓ'=1}^{N} λ_ℓ' as the explained variance ratios. When keeping only the first few principal components, PCA is an efficient dimension reduction approach which captures most variations of the original data. Moreover, PCA yields an optimal approximation of the data in the sense of minimizing the squared reconstruction error [18].
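As an illustration of how Eq. (3) translates into practice, a minimal Python sketch is given below; it diagonalizes XᵀX for the data matrix X assembled above and reads off the explained variance ratios and the leading weight vector. In practice one would typically use a singular value decomposition or scikit-learn's PCA instead; the variable names are illustrative.

import numpy as np

# X: the (M, N) matrix of Ising configurations from Eq. (2)
cov = X.T @ X                                   # the N x N matrix appearing in Eq. (3)
eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues returned in ascending order
order = np.argsort(eigvals)[::-1]               # re-sort in descending order
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained_variance_ratio = eigvals / eigvals.sum()   # the normalized eigenvalues of Eq. (3)
w1 = eigvecs[:, 0]                              # weights of the first principal component

print(explained_variance_ratio[:10])            # one dominant component for the Ising data
print(w1[:10])                                  # roughly uniform, of order 1/sqrt(N)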

Figure 1 shows the first few explained variance ratios for various system sizes. Notably, there is only one dominant principal component. As the temperature changes, the Ising configurations vary most significantly along the first principal component, whose weight is shown in the inset of Fig. 1. The flat distribution over all the lattice sites means the transformation essentially gives the uniform magnetization (1/N) ∑_i σ_i. In this sense, PCA has identified the order parameter of the Ising model (1) across the phase transition.

Next, we project the samples onto the space spanned by the first two principal components, shown in Figure 2. The color of each sample indicates its temperature. The projected coordinates are given by the matrix-vector product

y_ℓ = X w_ℓ .   (4)

The variation of the data along the first principal axis y_1 is indeed much stronger than that along the second principal axis y_2. Most importantly, one clearly observes that as the system size grows the samples tend to split into three clusters. The high-temperature samples lie around the origin, while the low-temperature samples lie symmetrically at finite y_1. The samples near the critical temperature (light yellow dots) have a broad spread because of large critical fluctuations. We note that Ref. [13] presents a different low-dimensional visualization of the Ising configurations using the stochastic neighbor embedding technique.

Figure 4: Explained variance ratios of the COP Ising model. Insets show the weights corresponding to the four leading principal components.

When folding the horizontal axis of Fig. 2 into ∑_i |σ_i| or (∑_i σ_i)², the two clusters associated with the low-temperature phase merge together. With such a linearly separable low-dimensional representation of the original data, a cluster analysis [36] can easily divide the samples into two phases, thus identifying the phase transition. Notice that our unsupervised learning analysis not only finds the phase transition and an estimate of the critical temperature but also provides insight into the order parameter.
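A minimal sketch of this clustering step, continuing from the PCA sketch above: the samples are projected onto the leading principal component via Eq. (4), the two symmetric low-temperature lobes are folded together by squaring, and a two-cluster k-means (one possible choice from the scikit-learn cluster module mentioned in [36]) splits the ordered from the disordered samples. The choice of k-means and the variable names are illustrative assumptions, not the specific procedure of this work.

import numpy as np
from sklearn.cluster import KMeans

y1 = X @ w1                                    # projection onto the first principal component, Eq. (4)
features = (y1 ** 2).reshape(-1, 1)            # fold the two symmetric ferromagnetic lobes together

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
# Comparing the two cluster labels with the withheld sample temperatures
# verifies that the split occurs near the known critical temperature.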

Having established this baseline of applying unsupervised learning techniques to the prototypical Ising model, we now turn to a more challenging case where the learner can make nontrivial findings. For this, we consider the same Ising model Eq. (1) with a conserved order parameter (COP), ∑_i σ_i ≡ 0. This model describes classical lattice gases [19], where the occupation of each lattice site can be either one or zero and the particles interact via a short-range attraction. The conserved total magnetization corresponds to the constraint of a half-filled lattice.

On a square lattice with periodic boundary conditions, the spins tend to form two domains at low temperatures, as shown in Fig. 3(a,b). The two domain walls wrap around the lattice either horizontally or vertically to minimize the domain-wall energy [19]. Besides, the domains can also shift in space due to translational invariance. As the temperature increases, these domain walls melt and the system restores both the translational and rotational symmetries in the high-temperature phase shown in Fig. 3(c). At zero total magnetization, the critical temperature of this solid-gas phase transition is the same as that of the Ising transition, T_c/J ≈ 2.269 [20]. However, since the total magnetization is conserved, simply summing up the Ising spins cannot be used as an indicator to distinguish the two phases. In fact, it was unclear to the author which quantity signifies the phase transition before this study. It is, therefore, a good example to demonstrate the ability of the unsupervised learning approach.


Figure 5: Projections of the COP Ising samples onto the four leading principal components.

We perform the same PCA on the COP Ising configurations sampled with Monte Carlo simulation [19] and show the first few explained variance ratios in Fig. 4. Notably, there are four, instead of one, leading principal components. Their weights, plotted in the insets of Fig. 4, show notable nonuniformity over the lattice sites. This indicates that in the COP Ising model the spatial distribution of the spins varies drastically as the temperature changes. Denote the Euclidean coordinate of site i as (µ_i, ν_i), where µ_i, ν_i = 1, 2, . . . , √N. The weights of the four leading principal components can then be written as cos(θ_i), cos(φ_i), sin(θ_i), sin(φ_i), where (θ_i, φ_i) = (µ_i, ν_i) × 2π/√N [37]. Note that these four mutually orthogonal weights correspond to the two orientations of the domain walls shown in Fig. 3(a,b). Therefore, the PCA correctly finds the rotational symmetry breaking caused by the domain-wall formation.
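For concreteness, a minimal sketch of these analytic weight vectors on an L × L lattice is given below; the lattice size, the name W4 for the matrix of the four leading PCA weight vectors, and the overlap check at the end are illustrative assumptions. Up to normalization and a linear mixing [37], the PCA weights should span the same four-dimensional subspace.

import numpy as np

L = 20                                                     # linear lattice size, N = L**2
mu, nu = np.meshgrid(np.arange(1, L + 1), np.arange(1, L + 1), indexing="ij")
theta = mu.flatten() * 2 * np.pi / L                       # theta_i = mu_i * 2*pi / sqrt(N)
phi = nu.flatten() * 2 * np.pi / L                         # phi_i  = nu_i * 2*pi / sqrt(N)

analytic = np.stack([np.cos(theta), np.cos(phi),
                     np.sin(theta), np.sin(phi)], axis=1)  # shape (N, 4)
analytic /= np.linalg.norm(analytic, axis=0)               # normalize each column

# W4: (N, 4) matrix of the four leading PCA weight vectors for the COP data (assumed available).
# A nearly orthogonal 4 x 4 overlap matrix indicates that both span the same subspace:
# overlap = analytic.T @ W4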

To visualize the samples in the four-dimensional feature space spanned by the first few principal components, we plot two-dimensional projections in Fig. 5. In all cases, the high-temperature samples lie around the origin while the low-temperature samples form a surrounding cloud. Motivated by the circular shapes of all these projections, we further reduce to a two-dimensional space via the nonlinear transformation (y_1, y_2, y_3, y_4) ↦ (y_1² + y_2², y_3² + y_4²). As shown in Fig. 6(a), the line ∑_{ℓ=1}^{4} y_ℓ² = const (a four-dimensional sphere of constant radius) separates the low- and high-temperature samples. This motivates a further dimension reduction to a single variable, ∑_{ℓ=1}^{4} y_ℓ², as an indicator of the phase transition in the COP Ising model.
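A short sketch of this reduction, where X_cop denotes a matrix of COP configurations and W4 the four leading PCA weight vectors (both illustrative names, not defined in the text):

import numpy as np

Y = X_cop @ W4                                 # shape (M, 4): the coordinates y_1 ... y_4 of each sample
pair = np.stack([Y[:, 0]**2 + Y[:, 1]**2,
                 Y[:, 2]**2 + Y[:, 3]**2], axis=1)   # the two-dimensional projection of Fig. 6(a)
indicator = (Y ** 2).sum(axis=1)               # sum_{l=1}^{4} y_l^2, large in the low-temperature phase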

Figure 6: (a) Further projection of the COP Ising samples onto a two-dimensional space. (b) The structure factor Eq. (5) of the COP Ising model versus temperature for various system sizes.

Substituting the weights of the four principal components, cos(θ_i), cos(φ_i), sin(θ_i), sin(φ_i), the sum ∑_{ℓ=1}^{4} y_ℓ² is proportional to

S = (1/N²) ∑_{i,j} σ_i σ_j [cos(θ_i − θ_j) + cos(φ_i − φ_j)] .   (5)
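A minimal sketch of evaluating Eq. (5) directly from a spin configuration is given below; using cos(a − b) = cos a cos b + sin a sin b, the double sum factorizes into squares of single sums, so the cost per sample is O(N). The function name and arguments are illustrative.

import numpy as np

def structure_factor(sigma, L):
    """Structure factor of Eq. (5) for a flattened array of +/-1 spins on an L x L lattice."""
    N = L * L
    mu, nu = np.meshgrid(np.arange(1, L + 1), np.arange(1, L + 1), indexing="ij")
    theta = mu.flatten() * 2 * np.pi / L
    phi = nu.flatten() * 2 * np.pi / L
    # sum_{i,j} sigma_i sigma_j cos(theta_i - theta_j)
    #   = (sum_i sigma_i cos theta_i)^2 + (sum_i sigma_i sin theta_i)^2, and likewise for phi
    s_theta = (sigma * np.cos(theta)).sum() ** 2 + (sigma * np.sin(theta)).sum() ** 2
    s_phi = (sigma * np.cos(phi)).sum() ** 2 + (sigma * np.sin(phi)).sum() ** 2
    return (s_theta + s_phi) / N ** 2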

Even though this structure factor was unknown to the author before it was discovered by the learner, one can convince oneself that it indeed captures the domain-wall formation at low temperatures shown in Fig. 3(a,b). Figure 6(b) shows the structure factor versus temperature for various system sizes. It decreases as the temperature increases and clearly serves as a good indicator of the phase transition. We emphasize that the input spin configurations contain no information about the lattice geometry or the Hamiltonian. The unsupervised learner has nevertheless by itself extracted meaningful information related to the breaking of the orientational order. Therefore, even without knowledge of the lattice and without the analytical understanding of the structure factor Eq. (5), ∑_{ℓ=1}^{4} y_ℓ² plays the same role of separating the phases in the projected space.

It is interesting to compare our analysis of phase transitions to standard image recognition applications. In the Ising model example, the learner essentially finds the brightness of the image, ∑_i σ_i, as an indicator of the phase transition. In the COP Ising model example, instead of detecting the sharpness of edges (the melting of domain walls) as an ordinary image recognition routine would, the PCA learner finds the structure factor Eq. (5) related to symmetry breaking, which is a fundamental concept in phase transitions and condensed matter physics.

Considering that PCA is arguably one of the simplest unsupervised learning techniques, the obtained results are rather encouraging. In essence, our analysis finds the dominant collective modes of the system related to the phase transition. The approach can be readily generalized to more complex cases such as models with emergent symmetry and order by disorder [21]. The unsupervised learning approach is particularly profitable in the case of hidden or multiple intertwined orders, where it can help to single out the various phases.

Although the nonlinear transformation of the raw configurations in Eq. (5) was discovered via visualization in Fig. 5, simple PCA is limited to linear transformations. Therefore, it remains challenging to identify more subtle phase transitions related to topological order, where the indicators of the phase transition are nontrivial nonlinear functions of the original configurations. For this purpose, it would be interesting to see whether a machine learning approach can comprehend concepts such as the duality transformation [22], the Wilson loop [23], and the string order parameter [24]. A judicious application of kernel techniques [25] or neural-network-based deep autoencoders [26] may achieve some of these goals.

Furthermore, although our discussion focuses on thermal phase transitions of the classical Ising model, the unsupervised learning approaches can also be used to analyze quantum many-body systems and quantum phase transitions [27]. In these applications, diagnosing quantum states of matter without knowledge of the Hamiltonian is a useful paradigm when one only has access to wavefunctions or experimental data.

Acknowledgment. The author thanks Xi Dai, Ye-Hua Liu, Yuan Wan, QuanSheng Wu, and Ilia Zintchenko for discussions and encouragement. The author also thanks Zi Cai for discussions and a careful reading of the manuscript. L.W. is supported by the start-up funding of IOP-CAS.

[1] P. W. Anderson, Basic notions of condensed matter physics (The Benjamin-Cummings Publishing Company, 1984).
[2] X.-G. Wen, Quantum field theory of many-body systems (Oxford University Press, 2004).
[3] S. Curtarolo, D. Morgan, K. Persson, J. Rodgers, and G. Ceder, Physical Review Letters 91, 135503 (2003).
[4] O. S. Ovchinnikov, S. Jesse, P. Bintacchit, S. Trolier-McKinstry, and S. V. Kalinin, Physical Review Letters 103, 157203 (2009).
[5] G. Hautier, C. C. Fischer, A. Jain, T. Mueller, and G. Ceder, Chemistry of Materials 22, 3762 (2010).
[6] J. C. Snyder, M. Rupp, K. Hansen, K.-R. Müller, and K. Burke, Physical Review Letters 108, 253002 (2012).
[7] Y. Saad, D. Gao, T. Ngo, S. Bobbitt, J. R. Chelikowsky, and W. Andreoni, Physical Review B 85, 104104 (2012).
[8] E. LeDell, Prabhat, D. Y. Zubarev, B. Austin, and W. A. Lester, Journal of Mathematical Chemistry 50, 2043 (2012).
[9] M. Rupp, A. Tkatchenko, K.-R. Müller, and O. A. von Lilienfeld, Physical Review Letters 108, 058301 (2012).
[10] L.-F. Arsenault, A. Lopez-Bezanilla, O. A. von Lilienfeld, and A. J. Millis, Physical Review B 90, 155136 (2014).
[11] G. Pilania, J. E. Gubernatis, and T. Lookman, Physical Review B 91, 214302 (2015).
[12] Z. Li, J. R. Kermode, and A. De Vita, Physical Review Letters 114, 096405 (2015).
[13] J. Carrasquilla and R. G. Melko, (2016), arXiv:1605.01735.
[14] L. Onsager, Physical Review 65, 117 (1944).
[15] U. Wolff, Physical Review Letters 62, 361 (1989).
[16] B. S. Everitt, S. Landau, M. Leese, and D. Stahl, Cluster Analysis (Wiley, 2010).
[17] K. Pearson, Philosophical Magazine 2, 559 (1901).
[18] I. Jolliffe, Principal Component Analysis (John Wiley & Sons, Ltd, Chichester, UK, 2002).
[19] M. Newman and G. T. Barkema, Monte Carlo methods in statistical physics (Oxford, 1999).
[20] C. N. Yang, Physical Review 85, 808 (1952).
[21] R. Moessner and S. L. Sondhi, Physical Review B 63, 224401 (2001).
[22] F. J. Wegner, Journal of Mathematical Physics 12, 2259 (1971).
[23] K. G. Wilson, Physical Review D 10, 2445 (1974).
[24] M. den Nijs and K. Rommelse, Physical Review B 40, 4709 (1989).
[25] B. Schölkopf, A. Smola, and K.-R. Müller, Neural Computation (1998).
[26] G. E. Hinton and R. R. Salakhutdinov, Science 313, 504 (2006).
[27] S. Sachdev, Quantum phase transitions (Cambridge University Press, 2011).
[28] L. Saitta, A. Giordana, and A. Cornuéjols, Phase Transitions in Machine Learning (Cambridge University Press, 2011).
[29] P. Mehta and D. J. Schwab, (2014), arXiv:1410.3831.
[30] E. M. Stoudenmire and D. J. Schwab, (2016), arXiv:1605.05775v1.
[31] S. Lloyd, M. Mohseni, and P. Rebentrost, (2013), arXiv:1307.0411.
[32] S. R. White, Physical Review Letters 69, 2863 (1992).
[33] We also note applications of physics ideas such as phase transitions [28], the renormalization group [29], tensor networks [30], and quantum computation [31] to machine learning.
[34] Each column of X sums up to zero since on average each site has zero magnetization.
[35] In practice this eigenproblem is often solved by singular value decomposition of X. In fact, replacing the input data X (raw spin configurations collected at various temperatures) by the wave function of a one-dimensional quantum system, the math here is identical to the truncation of Schmidt coefficients in density-matrix renormalization group calculations [32].
[36] See, for example, the methods provided in the scikit-learn cluster module http://scikit-learn.org/stable/modules/clustering.html
[37] The weights shown in the insets of Fig. 4 are linear mixtures of them.