Envisioning Text: Learning Semantic Maps in Hyperbolic Space


Transcript of Envisioning Text: Learning Semantic Maps in Hyperbolic Space

Page 1: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Envisioning Text: Learning Semantic Maps in Hyperbolic Space

Jörg Ontrup

Neuroinformatics Group, Faculty of Technology

Bielefeld University

Intelligent Information Systems 2005, Gdańsk


Page 2: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Outline of the Talk

1 Introduction

2 Methods
   Self-organizing maps
   Hyperbolic space
   Learning of large maps with logarithmic complexity
   The “bag of words” vs. the “pyramid of words”

3 Applications
   Organisation of large text archives
   Handling of temporal data
   An online demonstration

4 Conclusion


Page 6: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Information Overload

40 years of Moore’s “Law”

Exponential growth of transistors per integrated circuit

⇒ corresponding growth in computing power

⇒ equally important: exponential growth in storage capacities

Consequences & examples

We observe an “explosion” of the amount of information available online

Consider MEDLINE: citations from approx. 4,800 journals

over 571,000 references added during 2004

⇒ 65 articles are added each hour
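(Roughly: 571,000 references / (365 days × 24 h) ≈ 571,000 / 8,760 ≈ 65 new articles per hour.)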


Page 8: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Historical Notes - the Memex

“As We May Think”, Vannevar Bush, July 1945

An essay in Atlantic Monthly draws attention

TIME Magazine publishes several diagrams

“A memex is a device in which an individual stores all his books, records, and communications [...] It is an enlarged intimate supplement to his memory.”


Page 10: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Today’s Approaches

Classical information retrieval

Massive full-text indexing of web sites

⇒ good when we know what to look for

Machine learning methods

Clustering of query results

⇒ helps to identify topic clusters

Information Visualization

Often used: map metaphor

⇒ allows navigation in information space


Page 13: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Future Directions

The Semantic Web: from pure syntax to semantic information

Semantic annotations lessen the burden to “hit” the right keyword

Still quite a way to go:

Manual tagging of content is very expensive

Hard to construct ontologies dealing with general knowledge

Even harder to do that automatically


Page 15: Envisioning Text Learning Semantic Maps in Hyperbolic Space


How Does Our Brain Organize Information?

Consider input from our senses, e.g. the sense of touch:

The Self-Organizing Map as a Model

Introduced by Kohonen in the 80s

Biophysically motivated model

unsupervised learning paradigm
generates topologically ordered feature maps

Translates data similarities into spatial relations

Widely used as an exploratory data analysis tool


Page 20: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Self-Organizing Maps

The cortex is modelled by neurons placed on a lattice structure; most prominently used are grids in 2D Euclidean space

SOM algorithm:

[Figure: a lattice of neurons, each with a prototype vector attached, pointing into the high-dimensional input space]
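To make the adaptation rule concrete, here is a minimal Python sketch of one SOM learning step (illustrative only; grid size, learning rate and neighbourhood width are arbitrary choices, not values from the talk):

```python
import numpy as np

def som_step(weights, grid, x, eps, sigma):
    """One adaptation step of a (Euclidean) SOM.

    weights: (N, d) prototype vectors, one per lattice node
    grid:    (N, 2) lattice coordinates of the nodes
    x:       (d,)   current input vector
    eps:     learning rate epsilon(t)
    sigma:   neighbourhood width sigma(t)
    """
    # 1. winner search: node whose prototype is closest to x
    winner = int(np.argmin(((weights - x) ** 2).sum(axis=1)))
    # 2. Gaussian neighbourhood measured on the lattice, not in input space
    d2 = ((grid - grid[winner]) ** 2).sum(axis=1)
    h = np.exp(-d2 / sigma**2)
    # 3. pull every prototype toward x, weighted by the neighbourhood
    weights += eps * h[:, None] * (x - weights)
    return winner

# toy usage: a 10x10 lattice mapping 8-dimensional inputs
grid = np.array([(i, j) for i in range(10) for j in range(10)], dtype=float)
weights = np.random.rand(100, 8)
winner = som_step(weights, grid, np.random.rand(8), eps=0.5, sigma=2.0)
```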


Page 25: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Self-Organizing Maps

animal   small  medium  big  2 legs  hooves  hunter  can fly  swimmer
cow        0      0      1     0       1       0       0        0
zebra      0      0      1     0       1       0       0        0
horse      0      0      1     0       1       0       0        0
lion       0      0      1     0       1       1       0        0
tiger      0      0      1     0       1       1       0        0
cat        1      0      0     0       1       1       0        0
wolf       0      1      0     0       1       1       0        0
dog        0      1      0     0       1       0       0        0
fox        0      1      0     0       1       1       0        0
eagle      0      1      0     1       0       1       1        0
hawk       1      0      0     1       0       1       1        0
owl        1      0      0     1       0       1       1        0
goose      1      0      0     1       0       0       1        1
duck       1      0      0     1       0       0       0        1
hen        1      0      0     1       0       0       0        0
dove       1      0      0     1       0       0       1        0


Page 27: Envisioning Text Learning Semantic Maps in Hyperbolic Space


SOM-based Document Maps

The WebSOM project with 7 million patent abstracts

André Skupin applied cartographic design concepts:


Page 29: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Pros and Cons of the standard SOM

Strengths

Completely unsupervised method

Provides map metaphor well suited for visualization purposes

Achieves good classification rates, can capture non-linearities

Has proven its applicability in many domains

Limitations

Resolution depends on the map area, A ∝ N²

⇒ Training of large maps is computationally expensive

Large maps do not fit on limited screens

⇒ Demand for focus & context techniques


Page 31: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Going to Hyperbolic Space

The world of Euclidean geometry follows Euclid’s five axioms

Euclid’s 5th, the “parallel postulate”, was long eyed with suspicion:

[Figure: a line l and a point P not on it]

Through a given point, not on a given line, only one parallel can be drawn to the given line.

Around 1820 Gauss, Lobachevsky & Bolyai independently negated the 5th postulate

⇒ They found a completely consistent non-Euclidean geometry

⇒ characterized by being negatively “curved”


Page 36: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Going to Hyperbolic Space

What does the hyperbolic plane IH² look like?

Locally isometric embedding in IR³

⇒ yields a “wrinkled” structure
⇒ resembles a saddle at every point of the surface

Characteristics of Hyperbolic Space

For the area A and circumference C of a circle with radius r, the following holds:

A(r) = 4π sinh²(r/2),   C(r) = 2π sinh(r)

⇒ A and C grow asymptotically exponentially with the radius
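A quick numerical check of this growth (a small sketch; the radii are arbitrary), comparing the hyperbolic formulas above with their Euclidean counterparts:

```python
import math

for r in (1.0, 2.0, 4.0, 8.0):
    A_hyp = 4 * math.pi * math.sinh(r / 2) ** 2   # hyperbolic area
    C_hyp = 2 * math.pi * math.sinh(r)            # hyperbolic circumference
    A_euc = math.pi * r ** 2                      # Euclidean area
    C_euc = 2 * math.pi * r                       # Euclidean circumference
    print(f"r={r:4.1f}  A_hyp={A_hyp:10.1f} (euc {A_euc:6.1f})  "
          f"C_hyp={C_hyp:10.1f} (euc {C_euc:5.1f})")
```

Already at r = 8 the hyperbolic values dwarf the Euclidean ones by roughly a factor of two hundred, which is exactly the extra “room” a hyperbolic map can exploit.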


Page 38: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Projection of IH²

How can we project the infinite plane IH² onto a Euclidean screen?

The Poincaré Disk

Under the embedding of IH² in Minkowski space

x = sinh(ρ) cos(φ)

y = sinh(ρ) sin(φ)

u = cosh(ρ)

⇒ IH² appears as a surface M which can be projected onto the unit disk

[Figure: geometric construction of the projection of the surface M onto the unit disk]
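A small sketch of one standard way to carry out this projection, assuming the usual stereographic projection of the embedded surface from the point (0, 0, −1); the construction on the slide may differ in detail:

```python
import math

def hyperboloid_point(rho, phi):
    """Point on the embedded surface M for polar coordinates (rho, phi)."""
    return (math.sinh(rho) * math.cos(phi),
            math.sinh(rho) * math.sin(phi),
            math.cosh(rho))

def to_poincare_disk(rho, phi):
    """Project (rho, phi) onto the unit disk.

    Stereographic projection of (x, y, u) from (0, 0, -1) gives
    (x, y) / (1 + u), which reduces to a disk radius of tanh(rho / 2).
    """
    x, y, u = hyperboloid_point(rho, phi)
    return (x / (1 + u), y / (1 + u))

# points far out in IH^2 land close to, but never on, the disk boundary
for rho in (0.5, 2.0, 5.0, 10.0):
    px, py = to_poincare_disk(rho, 0.0)
    print(f"rho={rho:5.1f} -> disk radius {math.hypot(px, py):.6f}")
```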


Page 40: Envisioning Text Learning Semantic Maps in Hyperbolic Space


The Poincaré Disk

Properties of the Poincaré Model

Locally shape preserving

Non-isometric, with a strong “fish-eye” effect:

Fovea is mapped faithfully

Distant regions are “squeezed”

A Möbius transformation allows an elegant translation of the fovea
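A minimal sketch of such a focus shift using the standard disk automorphism z ↦ (z + c)/(1 + c̄z); the exact parametrisation used in the talk’s software is not given here, so names and values are illustrative:

```python
def moebius_translate(z: complex, c: complex) -> complex:
    """Disk automorphism that moves the origin to c (requires |c| < 1, |z| < 1).

    Applying it with c = -focus moves the chosen focus point into the
    centre of the Poincare disk while keeping the whole map inside the
    unit circle and preserving hyperbolic distances.
    """
    return (z + c) / (1 + c.conjugate() * z)

# example: bring the point 0.9+0j (near the rim) into the fovea
focus = 0.9 + 0j
nodes = [0j, 0.5 + 0.3j, 0.9 + 0j, -0.2 - 0.7j]
translated = [moebius_translate(z, -focus) for z in nodes]
print(translated)   # the old focus now sits at the origin
```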


Page 44: Envisioning Text Learning Semantic Maps in Hyperbolic Space


The Hyperbolic Self-Organizing Map

Introduced by Helge Ritter in 1999

Nodes are placed on the vertices of a regular tessellation of IH² with equilateral triangles

Pros

Shows favourable classification results

Allows for intuitive browsing

Cons

Training of very large maps still expensive

Most things “happen” at the perimeter


Page 47: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Architecture of the H²SOM

1 Initialization:
   start with nb equilateral triangles
   place nodes at the outer vertices
   train the nodes with the traditional SOM approach:

   Δw_a = ε(t) h(a, a*) (x − w_a)

   h(a, a*) = exp( −d(a, a*)² / σ(t)² )

   d(a, a*) = 2 arctanh( |z_a − z_a*| / |1 − z_a z̄_a*| )

   the nodes now form the 1st level of the hierarchy
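The same update equations as a Python sketch, with node positions stored as complex numbers on the Poincaré disk (variable names are mine, not from the original implementation):

```python
import numpy as np

def poincare_distance(za: complex, zb: complex) -> float:
    """Hyperbolic distance between two points of the Poincare disk."""
    return 2.0 * float(np.arctanh(abs(za - zb) / abs(1.0 - za * np.conj(zb))))

def h2som_step(weights, positions, x, eps, sigma):
    """One adaptation step; 'positions' holds the nodes' disk coordinates."""
    winner = int(np.argmin(((weights - x) ** 2).sum(axis=1)))
    z_win = positions[winner]
    # the neighbourhood is measured with the *hyperbolic* node distance
    d = np.array([poincare_distance(z, z_win) for z in positions])
    h = np.exp(-d**2 / sigma**2)
    weights += eps * h[:, None] * (x - weights)
    return winner
```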


Page 49: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Architecture of the H²SOM

2 Growth Step:
   compute the quantization error (QE) for all nodes & mark those with QE > Θ_QE
   expand the nodes with QE > Θ_QE
   fixate the neurons of the inner hierarchy levels & “move” adaptation to the new structural level
   reiterate the growth step
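A sketch of the growth criterion; `theta_qe` and the `expand_node` helper are placeholders for whatever the actual implementation provides:

```python
import numpy as np

def quantization_errors(weights, data, assignments):
    """Mean distance of each node's assigned inputs to its prototype."""
    qe = np.zeros(len(weights))
    for a in range(len(weights)):
        hits = data[assignments == a]
        if len(hits):
            qe[a] = np.linalg.norm(hits - weights[a], axis=1).mean()
    return qe

def grow(weights, data, assignments, theta_qe, expand_node):
    """Expand every node whose quantization error exceeds the threshold."""
    qe = quantization_errors(weights, data, assignments)
    for a in np.where(qe > theta_qe)[0]:
        expand_node(a)   # attach a new ring of child nodes in IH^2
    # afterwards: freeze the inner levels and adapt only the new children
```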


Page 56: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Fast Tree Search

The “uniformly hierarchical” structure of the hyperbolic grid allows a significant speed-up for finding the winner neuron:


Page 57: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Fast Tree Search

The “uniformly hierarchical” structure of the hyperbolic grid allows a significant speed-up for finding the winner neuron:

⇒ for k = 1 we need O(log_nb N) comparisons instead of O(N)

Training of very large maps with logarithmic complexity
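A sketch of this descent through the hierarchy, assuming a `children` mapping that encodes the rings of the hyperbolic lattice (for k = 1 only one branch is followed per level instead of scanning all N nodes):

```python
import numpy as np

def fast_winner_search(x, weights, children, root_nodes, k=1):
    """Descend the node hierarchy instead of scanning all N nodes.

    children: dict mapping a node index to the indices of its children
              on the next ring of the hyperbolic lattice.
    """
    def dist(a):
        return float(((weights[a] - x) ** 2).sum())

    # start at the first ring around the map centre
    beam = sorted(root_nodes, key=dist)[:k]
    best = beam[0]
    while True:
        candidates = [c for a in beam for c in children.get(a, [])]
        if not candidates:
            return best                      # reached the outermost ring
        beam = sorted(candidates, key=dist)[:k]
        if dist(beam[0]) < dist(best):
            best = beam[0]
```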


Page 58: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Fast Tree Search

The “uniformly hierarchical” structure of the hyperbolic grid allows a significant speed-up for finding the winner neuron:

Benchmark with MNIST database of handwritten digits

training time: 13 min vs. 18h 34min

classification: 94.6% vs. 92.7%


Page 59: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Text Representation

The standard approach to text representation: the “bag of words”

[Figure: a bag-of-words count vector over terms such as chianti, merlot, vodka, tomato, aubergine, strawberry and raspberry]


Page 60: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Text Representation

Alternative: use WordNet and its hypernym hierarchy

⇒ We might call it a “pyramid of words”


Page 61: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Text Representation

⇒ WordNet allows for a hierarchical representation of text data:

[Figure: the bag-of-words counts propagated up the WordNet hypernym hierarchy, so that terms such as chianti and merlot also contribute to the counts of their hypernyms alcohol and food, while tomato and aubergine contribute to vegetable and food]
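A toy sketch of the idea, with a hand-coded hypernym map standing in for WordNet and an invented five-token document:

```python
from collections import Counter

# tiny stand-in for the WordNet hypernym hierarchy
HYPERNYM = {
    "chianti": "alcohol", "merlot": "alcohol", "vodka": "alcohol",
    "tomato": "vegetable", "aubergine": "vegetable",
    "strawberry": "fruit", "raspberry": "fruit",
    "alcohol": "food", "vegetable": "food", "fruit": "food",
}

def bag_of_words(tokens):
    """Flat term counts - the classical representation."""
    return Counter(tokens)

def pyramid_of_words(tokens):
    """Propagate each term count up its full hypernym chain."""
    counts = Counter(tokens)
    for term, n in list(counts.items()):
        parent = HYPERNYM.get(term)
        while parent is not None:
            counts[parent] += n
            parent = HYPERNYM.get(parent)
    return counts

doc = ["chianti", "chianti", "tomato", "strawberry", "merlot"]
print(bag_of_words(doc))      # only the leaf terms
print(pyramid_of_words(doc))  # also counts alcohol, vegetable, fruit, food
```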


Page 62: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Hierarchical Representation & Training

A hierarchical data representation is ideally suited for hierarchical maps

⇒ train the map using the “pyramid of words”:

[Figure: the coarse categories (food, vegetable, fruit, alcohol) and the finer terms (tomato, aubergine, strawberry, raspberry, merlot, chianti, vodka) are used at different levels of the map]


Page 63: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Organisation of Large Text Archives

1 The initial map is trained with the complete document set


Page 64: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Organisation of Large Text Archives

2 The first structural level partitions the dataset into subclusters


Page 65: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Organisation of Large Text Archives

3 Nodes are expanded and trained via the fast tree search approach


Page 66: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Organisation of Large Text Archives

4 The growing network splits the documents into topologically ordered clusters of decreasing size


Page 68: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Event Detection in Document Streams

Many documents contain temporal information

News feeds & tickers such as Reuters

Chatroom messages & web forums

Mailing lists or incoming e-mails

Attach a time-dependent activation potential to the HSOM’s neurons

a_i(t) = β a_i(t − 1) + S_i(t)

S_i(t) = 1 if i is the best-match node at arrival time t, 0 otherwise

β: decay factor controlling the amount of leakage
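A sketch of this leaky-integrator update (the value of β and the node set are illustrative):

```python
def update_activations(activations, winner, beta=0.95):
    """Leaky-integrator update a_i(t) = beta * a_i(t-1) + S_i(t).

    activations: dict mapping node index -> current potential
    winner:      best-match node for the document that just arrived
    """
    for i in activations:
        activations[i] *= beta            # leak a little everywhere
    activations[winner] = activations.get(winner, 0.0) + 1.0
    return activations

# a burst of documents hitting the same node makes its potential spike,
# signalling a possible event in that topic region of the map
acts = {i: 0.0 for i in range(5)}
for winner in [2, 2, 2, 4, 2]:
    update_activations(acts, winner, beta=0.9)
print(acts)
```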


Page 70: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Demo


Page 76: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Summary

Hyperbolic space offers more freedom to build hierarchical maps

The peculiar geometry allows for focus & context enhanced browsing

Large maps of documents can be built with logarithmic complexity


Page 77: Envisioning Text Learning Semantic Maps in Hyperbolic Space


Thank you very much!
