Scienfic and Large Data Visualizaon 29 November 2017 High...

78
Scien&fic and Large Data Visualiza&on 29 November 2017 High Dimensional Data – Part II Massimiliano Corsini Visual Compu,ng Lab, ISTI - CNR - Italy

Transcript of Scienfic and Large Data Visualizaon 29 November 2017 High...

Page 1: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Scien&ficandLargeDataVisualiza&on29November2017

HighDimensionalData–PartII

MassimilianoCorsiniVisualCompu,ngLab,ISTI-CNR-Italy

Page 2: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Overview

•  GraphsExtensions•  Glyphs

–  ChernoffFaces–  Mul&-dimensionalIcons

•  ParallelCoordinates•  StarPlots•  DimensionalityReduc&on

–  PrincipalComponentAnalysis(PCA)–  LocallyLinearEmbedding(LLE)–  IsoMap–  SummonMapping–  t-SNE

Page 3: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

DimensionalityReduc,on

•  N-dimensionaldataareprojectedto2or3dimensionsforbeCervisualiza,on/understanding.

•  Widelyusedstrategy.•  Ingeneral,itisamappingnotageometrictransforma,on.

•  Differentmappingshavedifferentproper,es.

Page 4: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

PrincipalComponentAnalysis(PCA)

•  Aclassicmul,-dimensionalreduc,ontechniqueisPrincipalComponentAnalysis(PCA).

•  Itisalinearnon-parametrictechnique.•  Thecoreideatofindabasisformedbythedirec,onsthatmaximizethevarianceofthedata.

Page 5: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

PCAasachangeofbasis

•  Theideaistoexpressthedatainanewbasis,thatbestexpressourdataset.

•  Thenewbasisisalinearcombina,onoftheoriginalbasis.

Page 6: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

PCAasachangeofbasis

Page 7: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Signal-to-noiseRa,o(SNR)

•  Givenasignalwithnoise:

•  Itcanbeexpressedas:

Page 8: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Redundancy

Redundantvariablesconveynorelevantinforma&on!

Figure From Jonathon Shlens, “A Tutorial on Principal Component Analysis”, arXiv preprint arXiv:1404.1100, 2015.

Page 9: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

CovarianceMatrix

•  Squaresymmetricmatrix.•  Thediagonaltermsarethevarianceofapar,cularvariable.

•  Theoff-diagonaltermsarethecovariancebetweenthedifferentvariables.

Page 10: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Goals

•  HowtoselectthebestP?– Minimizeredundancy– Maximizethevariance

•  Goal:todiagonalizethecovariancematrixofY– Highvaluesofthediagonaltermsmeansthatthedynamicsofthesinglevariableshasbeenmaximized.

– Lowvaluesoftheoff-diagonaltermsmeansthattheredundancybetweenvariablesisminimized.

Page 11: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

SolvingPCARememberthat

Page 12: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

SolvingPCA

•  Theorem:asymmetricmatrixAcanbediagonalizedbyamatrixformedbyitseigenvectorsasA=EDET.

•  ThecolumnofEaretheeigenvectorsofA.

Page 13: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

PCAComputa,on

•  Organizethedataasanmxnmatrix.•  Subtractthecorrespondingmeantoeachrow.•  CalculatetheeigenvaluesandeigenvectorsofXXT.

•  OrganizethemtoformthematrixP.

Page 14: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

PCAforDimensionalityReduc,on

•  Theideaistofindthek-thprincipalcomponents(k<m).

•  Projectthedataonthesedirec,onsandusesuchdatainsteadoftheoriginalones.

•  Thisdataarethebestapproxima,onw.r.tthesumofthesquareddifferences.

Page 15: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

PCAastheProjec,onthatMinimizestheReconstruc,onError•  Ifweuseonlythefirstk<mcomponentsweobtainthebestreconstruc,onintermsofsquarederror.

Datapointprojectedonthefirstkcomponents.

Datapointprojectedonallthecomponents.

Page 16: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

PCAastheProjec,onthatMinimizestheReconstruc,onError

Page 17: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Example

Figure From Jonathon Shlens, “A Tutorial on Principal Component Analysis”, arXiv preprint arXiv:1404.1100, 2015.

Page 18: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

PCA–Example

Eachmeasurehas6dimensions(!)ButtheballmovesalongtheX-axisonly..

Page 19: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

LimitsofPCA

•  Itisnon-parametricàthisisastrengthpointbutitcanbealsoaweakpoint.

•  Itfailsfornon-Gaussiandistributeddata.•  Itcanbeextendedtoaccountfornon-lineartransforma,onàkernelPCA.

Page 20: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

LimitsofPCA

ICAguaranteessta&s&calindependenceà

Page 21: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

ClassicMDS

•  Findthelinearmappingwhichminimizes:

Euclideandistanceinlowdimensionalspace

Euclideandistanceinhighdimensionalspace

Page 22: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

PCAandMDS

•  Wewanttominimize,thiscorrespondstomaximize:

Thatisthevarianceofthelow-dimensionalpoints(samegoalofthePCA).

Page 23: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

PCAandMDS

•  Thesizeofthecovariancematrixispropor,onaltothedimensionofthedata.

•  MDSscaleswiththenumberofdatapointsinsteadofthedimensionsofthedata.

•  BothPCAandMDSpreservebeCerlargepairwisedistances.

Page 24: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

LocallyLinearEmbedding(LLE)•  LLEaCemptstodiscovernonlinearstructureinhighdimensionbyexploi,nglocallinearapproxima,on.

MappingDiscoveredNonlinearManifoldM SamplesonM

Page 25: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

LocallyLinearEmbedding(LLE)

•  INTUITIONàassumingthatthereissufficientdata(well-sampledmanifold)weexpecteachdatapointanditsneighborscanbeapproximatedbyalocallinearpatch.

•  Thepatchisrepresentedbyaweightedsumofthelocaldatapoints.

Page 26: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

ComputeLocalPatch

•  Chooseasetofdatapointsclosetoagivenone(ball-radiusorK-nearestneighbours).

•  Solvefor:

Page 27: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

LLEMapping

•  Findwhichminimizestheembeddingcostfunc,on:

Notethatweightsarefixedinthiscase!

Page 28: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

LLEAlgorithm

1.  Computetheneighborsofeachdatapoint,.

2.  Computetheweightsthatbestreconstruct.

3.  Computethevectorsthatminimizesthecostfunc,on.

Page 29: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

LLE–Example

PCAfailstopreservetheneighborhoodstructureofthenearbyimages.

Page 30: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

LLE–Example

Page 31: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

ISOMAP

•  Thecoreideaistopreservethegeodesicdistancebetweendatapoints.

•  Geodesicistheshortestpathbetweentwopointsonacurvedspace.

Page 32: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

ISOMAP

Euclideandistancevs

Geodesicdistance

Graphbuildand

GeodesicdistanceApproxima&on

Geodesicdistancevs

ApproximatedGeodesic

Page 33: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

ISOMAP

•  Constructneighborhoodgraph– DefinegraphGoveralldatapointsbyconnec,ngpoints(i,j)ifandonlyifthepointiisaKneareastneighborofpointj

•  Computetheshortestpath– UsingtheFloyd’salgorithm

•  Constructthed-dimensionalembedding

Page 34: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

ISOMAP

Page 35: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

ISOMAP

Page 36: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Autoencoders

•  MachinelearningisbecomingubiquitousinComputerScience.

•  Aspecialtypeofneuralnetworkiscalledautoencoder.

•  Anautoencodercanbeusedtoperformdimensionalityreduc,on.

•  First,letmesaysomethingaboutneuralnetwork..

Page 37: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Autoencoder

Low-dimensional Representation

Page 38: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Mul,-layerAutoencoder

Page 39: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

SummonMapping

•  Adapta,onofMDSbyweigh,ngthecontribu,onofeach(i,j)pair:

•  ThisallowstoretainthelocalstructureofthedatabeCerthanclassicalscaling(theretainofhighdistancesisnotprivileged).

Page 40: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

t-SNE

•  Mosttechniquesfordimensionalityreduc,onarenotabletoretainboththelocalandtheglobalstructureofthedatainasinglemap.

•  SimpletestsonhandwriCendigitsdemonstratethis(Songetal.2007).

L. Song, A. J. Smola, K. Borgwardt and A. Gretton, “Colored Maximum Variance Unfolding”, in Advances in Neural Information Processing Systems. Vol. 21, 2007.

Page 41: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Stochas,cNeighborEmbedding(SNE)

•  Similari,esbetweenhigh-andlow-dimensionaldatapointsismodeledwithcondi,onalprobabili,es.

•  Condi,onalprobabilitythatthepointxiwouldpeakxjasitsneighbor:

Page 42: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Stochas,cNeighborEmbedding(SNE)

•  Weareinterestedonlyinpairwisedistance

•  Forthelow-dimensionalpointsananalogouscondi,onalprobabilityisused:

Page 43: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Kullback-LeiblerDivergence

•  Codingtheory:expectednumberofextrabitsrequiredtocodesamplesfromthedistribu,onPifthecurrentcodeisop,mizeforthedistribu,onQ.

•  Bayesianview:ameasureoftheinforma,ongainedwhenonerevisesone'sbeliefsfromthepriordistribu,onQtotheposteriordistribu,onP.

•  ItisalsocalledrelaDveentropy.

Page 44: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Kullback-LeiblerDivergence

•  Defini,onfordiscretedistribu,ons:

•  Defini,onforcon,nuosdistribu,ons:

Page 45: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Stochas,cNeighborEmbedding(SNE)

•  Thegoalistominimizesthemismatchbetweenpj|iandqj|i.

•  UsingtheKullback-Leiblerdivergencethisgoalcanbeachievedbyminimizingthefunc,on:

NotethatKL(P||Q)isnotsymmetric!

Page 46: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

ProblemsofSNE

•  Thecostfunc,onisdifficulttoop,mize.•  SNEsuffers,asotherdimensionalityreduc,ontechniques,ofthecrowdingproblem.

Page 47: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

t-SNE

•  SNEismadesymmetric:

•  ItemploysaStudent-tdistribu,oninsteadofaGaussiandistribu,ontoevaluatethesimilaritybetweenpointsinlowdimension.

Page 48: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

t-SNEAdvantages

•  Thecrowdingproblemisalleviated.•  Op,miza,onismadesimpler.

Page 49: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Experiments

•  ComparisonwithLLE,IsomapandSummonMapping.

•  Datasets:– MNISTdataset– Oliveffacedataset– COIL-20dataset

Comparison figures are from the paper L.J.P. van der Maaten and G.E. Hinton, “Visualizing High-Dimensional Data Using t-SNE”, Journal of Machine Learning Research, Vol. 9, pp. 2579-2605, 2008.

Page 50: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

MNISTDataset

•  60,000imagesofhandwriCendigits.•  Imageresolu,on:28x28(784dimensions).•  Asubsetof6,000imagesrandomlyselectedhasbeenused.

Page 51: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

MNISTt-SNE

Page 52: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

MNISTSummonMapping

Page 53: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

MNISTLLE

Page 54: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

MNISTIsomap

Page 55: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

COIL-20Dataset

•  Imagesof20objectsviewedfrom72differentviewpoints(1440images).

•  Imagesize:32x32(1024dimensions).

Page 56: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

COIL-20Dataset

...

Page 57: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

t-SNE

LLEIsomap

SummonMapping

Page 58: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Objects

Arrangement

Page 59: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Mo,va,ons

•  Mul,dimensionalreduc,oncanbeusedtoarrangeobjectsin2Dor3Dpreservingpairwisedistances(butthefinalplacementisarbitrary).

•  Manyapplica,onsrequiretoplacetheobjectsinasetofpre-defined,discrete,posi,ons(e.g.onagrid).

Page 60: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Example–ImagesofFlowers

Random Order

Page 61: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Example–ImagesofFlowers

Isomap

Page 62: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Example–ImagesofFlowers

IsoMatch (computed on colors)

Page 63: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

ProblemStatement

Originalpairwisedistance

Euclideandistanceinthegrid

Permuta&on

Thegoalistofindthepermuta&onπthatminimizesthefollowingenergy:

Page 64: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

IsoMatch–Algorithm

•  StepI:DimensionalityReduc,on(usingIsomap)

•  StepII:CoarseAlignment(boundingbox)•  StepIII:Bipar,teMatching•  StepIV(op,onal):RandomRefinement(elementsswap)

Page 65: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Algorithm–StepIDimensionalityReduc,on

Page 66: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Algorithm–StepIICoarseAlignment

Page 67: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Bipar,teMatching

•  Acompletebipar,tegraphisbuilt(onewiththestar,ngloca,ons,onewiththetargetloca,ons)

•  Thearc(i,j)isweightedaccordingtothecorrespondingpairwisedistance.

•  Aminimalbipar,tematchingiscalculatedusingtheHungarianalgorithm.

Page 68: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Algorithm–StepIIIBipar,teMatching(graphbuilt)

Page 69: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Algorithm–StepIIIBipar,teMatching

Page 70: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Algorithm–StepIIIFinalAssignment

Page 71: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

AverageColors

Page 72: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

WordSimilarity

Page 73: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

PileBars

•  Anewtypeofthumbnailbar.•  Paradigm:focus+context.•  Objectsarearrangedinasmallspace(imagesaresubdividedintoclusterstosavespace).

•  Supportanyimage-imagedistance.•  PileBarsaredynamic!

Page 74: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

PileBars–Layouts

Page 75: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Slots

1 image 2 images 12 images 4 images 3 images

Page 76: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

PileBars

•  Thumbnailsaredynamicallyrearranged,resizedandreclusteredadap,velyduringthebrowsing.

•  ThisisdoneinawaytoensuresmoothtransiDons.

Page 77: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

PileBars-Applica,onExampleNaviga,onofRegisteredPhotographs

Take a look at http://vcg.isti.cnr.it/photocloud .

Page 78: Scienfic and Large Data Visualizaon 29 November 2017 High ...vcg.isti.cnr.it/~cignoni/SciViz1718/SciViz_18 High Dimensional Data II.pdf · • The diagonal terms are the variance

Ques&ons?