Anuska Ferligojˇ Vladimir Batagelj University of...

48
Blockmodeling Anuˇ ska Ferligoj Vladimir Batagelj University of Ljubljana European Science Foundation QMSS Workshop Ljubljana, Saturday, July 2, 2005, 9:00-12:30 version: June 29, 2005 / 01 : 19

Transcript of Anuska Ferligojˇ Vladimir Batagelj University of...

Page 1: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

'

&

$

%

Blockmodeling

Anuska FerligojVladimir Batagelj

University of Ljubljana

European Science FoundationQMSS Workshop

Ljubljana, Saturday, July 2, 2005, 9:00-12:30

version: June 29, 2005 / 01 : 19

Page 2: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 2'

&

$

%

Outline1 Matrix rearrangement view on blockmodeling . . . . . . . . . . . . . . . . . . . 1

2 Blockmodeling as a clustering problem . . . . . . . . . . . . . . . . . . . . . . 2

13 Indirect Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

22 Example: Support network among informatics students . . . . . . . . . . . . . . 22

27 Direct Approach and Generalized Blockmodeling . . . . . . . . . . . . . . . . . 27

37 Pre-specified blockmodeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

44 Blockmodeling in 2-mode networks . . . . . . . . . . . . . . . . . . . . . . . . 44

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 3: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 1'

&

$

%

Matrix rearrangement view on blockmodelingSnyder & Kick’s World trade network / n = 118, m = 514

Pajek - shadow 0.00,1.00 Sep- 5-1998World trade - alphabetic order

afg alb alg arg aus aut bel bol bra brm bul bur cam can car cha chd chi col con cos cub cyp cze dah den dom ecu ege egy els eth fin fra gab gha gre gua gui hai hon hun ice ind ins ire irn irq isr ita ivo jam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla mli mon mor nau nep net nic nig nir nor nze pak pan par per phi pol por rum rwa saf sau sen sie som spa sri sud swe swi syr tai tha tog tri tun tur uga uki upv uru usa usr ven vnd vnr wge yem yug zai

afg

a

lb

alg

a

rg

aus

a

ut

bel

b

ol

bra

b

rm

bul

b

ur

cam

c

an

car

c

ha

chd

c

hi

col

c

on

cos

c

ub

cyp

c

ze

dah

d

en

dom

e

cu

ege

e

gy

els

e

th

fin

fra

gab

g

ha

gre

g

ua

gui

h

ai

hon

h

un

ice

ind

ins

ire

irn

irq

isr

ita

ivo

jam

ja

p

jo

r

k

en

km

r

k

od

kor

k

uw

lao

leb

lib

liy

lux

maa

m

at

mex

m

la

mli

mon

m

or

nau

n

ep

net

n

ic

nig

n

ir

n

or

nze

p

ak

pan

p

ar

per

p

hi

pol

p

or

rum

r

wa

saf

s

au

sen

s

ie

som

s

pa

sri

sud

s

we

sw

i

s

yr

tai

tha

tog

tri

tun

tur

uga

u

ki

upv

u

ru

usa

u

sr

ven

v

nd

vnr

w

ge

yem

y

ug

zai

Pajek - shadow 0.00,1.00 Sep- 5-1998World Trade (Snyder and Kick, 1979) - cores

uki net bel lux fra ita den jap usa can bra arg ire swi spa por wge ege pol aus hun cze yug gre bul rum usr fin swe nor irn tur irq egy leb cha ind pak aut cub mex uru nig ken saf mor sud syr isr sau kuw sri tha mla gua hon els nic cos pan col ven ecu per chi tai kor vnr phi ins nze mli sen nir ivo upv gha cam gab maa alg hai dom jam tri bol par mat alb cyp ice dah nau gui lib sie tog car chd con zai uga bur rwa som eth tun liy jor yem afg mon kod brm nep kmr lao vnd

uki

n

et

bel

lu

x

fr

a

it

a

d

en

jap

usa

c

an

bra

a

rg

ire

sw

i

s

pa

por

w

ge

ege

p

ol

aus

h

un

cze

y

ug

gre

b

ul

rum

u

sr

fin

sw

e

n

or

irn

tur

irq

egy

le

b

c

ha

ind

pak

a

ut

cub

m

ex

uru

n

ig

ken

s

af

mor

s

ud

syr

is

r

s

au

kuw

s

ri

th

a

m

la

gua

h

on

els

n

ic

cos

p

an

col

v

en

ecu

p

er

chi

ta

i

k

or

vnr

p

hi

ins

nze

m

li

s

en

nir

ivo

upv

g

ha

cam

g

ab

maa

a

lg

hai

d

om

jam

tr

i

b

ol

par

m

at

alb

c

yp

ice

dah

n

au

gui

li

b

s

ie

tog

car

c

hd

con

z

ai

uga

b

ur

rw

a

s

om

eth

tu

n

li

y

jo

r

y

em

afg

m

on

kod

b

rm

nep

k

mr

lao

vnd

Alphabetic order of countries (left) and rearrangement (right)

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 4: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 2'

&

$

%

Blockmodeling as a clustering problem

The goal of blockmodeling is to reducea large, potentially incoherent networkto a smaller comprehensible structurethat can be interpreted more readily.Blockmodeling, as an empirical proce-dure, is based on the idea that units ina network can be grouped according tothe extent to which they are equivalent,according to some meaningful defini-tion of equivalence.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 5: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 3'

&

$

%

Cluster, clustering, blocks

One of the main procedural goals of blockmodeling is to identify, in a givennetwork N = (U , R), R ⊆ U × U , clusters (classes) of units that sharestructural characteristics defined in terms of R. The units within a clusterhave the same or similar connection patterns to other units. They form aclustering C = {C1, C2, . . . , Ck} which is a partition of the set U . Eachpartition determines an equivalence relation (and vice versa). Let us denoteby ∼ the relation determined by partition C.

A clustering C partitions also the relation R into blocks

R(Ci, Cj) = R ∩ Ci × Cj

Each such block consists of units belonging to clusters Ci and Cj and allarcs leading from cluster Ci to cluster Cj . If i = j, a block R(Ci, Ci) iscalled a diagonal block.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 6: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 4'

&

$

%

Blockmodel

A blockmodel consists of structures obtained by identifying all units fromthe same cluster of the clustering C. For an exact definition of a blockmodelwe have to be precise also about which blocks produce an arc in the reducedgraph and which do not. The reduced graph can be presented by relationalmatrix, called also image matrix.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 7: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 5'

&

$

%

Structural and regular equivalence

Regardless of the definition of equivalence used, there are two basicapproaches to the equivalence of units in a given network (compare Faust,1988):

• the equivalent units have the same connection pattern to the sameneighbors;

• the equivalent units have the same or similar connection pattern to(possibly) different neighbors.

The first type of equivalence is formalized by the notion of structuralequivalence and the second by the notion of regular equivalence with thelatter a generalization of the former.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 8: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 6'

&

$

%

Structural equivalence

Units are equivalent if they are connected to the rest of the network inidentical ways (Lorrain and White, 1971). Such units are said to bestructurally equivalent.

In other words, X and Y are structurally equivalent iff:

s1. XRY ⇔ YRX s3. ∀Z ∈ U \ {X,Y} : (XRZ ⇔ YRZ)s2. XRX ⇔ YRY s4. ∀Z ∈ U \ {X,Y} : (ZRX ⇔ ZRY)

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 9: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 7'

&

$

%

. . . structural equivalence

The blocks for structural equivalence are null or complete with variationson diagonal in diagonal blocks.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 10: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 8'

&

$

%

Regular equivalence

Integral to all attempts to generalize structural equivalence is the idea thatunits are equivalent if they link in equivalent ways to other units that arealso equivalent.

White and Reitz (1983): The equivalence relation ≈ on U is a regularequivalence on network N = (U , R) if and only if for all X,Y,Z ∈ U ,X ≈ Y implies both

R1. XRZ ⇒ ∃W ∈ U : (YRW ∧W ≈ Z)R2. ZRX ⇒ ∃W ∈ U : (WRY ∧W ≈ Z)

Another view of regular equivalence is based on colorings (Everett, Borgatti1996).

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 11: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 9'

&

$

%

. . . regular equivalence

Theorem 1 (Batagelj, Doreian, Ferligoj, 1992) Let C = {Ci} be apartition corresponding to a regular equivalence ≈ on the network N =(U , R). Then each block R(Cu, Cv) is either null or it has the propertythat there is at least one 1 in each of its rows and in each of its columns.Conversely, if for a given clustering C, each block has this property thenthe corresponding equivalence relation is a regular equivalence.

The blocks for regular equivalence are null or 1-covered blocks.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 12: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 10'

&

$

%

An Example

r r r rr r

rr

JJ

J

JJ

J�

���

ZZ

ZZ

1

2

3

4 5

6

7 8

For the relation R given with the graph in figure the equivalences mentionedare

≈str = {{4, 5}, {7, 8}, {1}, {2}, {3}, {6}}≈reg = {{1, 4, 5, 7, 8}, {2, 3, 6}}

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 13: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 11'

&

$

%

Establishing Blockmodels

The problem of establishing a partition of units in a network in terms of aselected type of equivalence is a special case of clustering problem that canbe formulated as an optimization problem (Φ, P ) as follows:

Determine the clustering C? ∈ Φ for which

P (C?) = minC∈Φ

P (C)

where Φ is the set of feasible clusterings and P is a criterion function.

Since the set of units U is finite, the set of feasible clusterings is alsofinite. Therefore the set Min(Φ, P ) of all solutions of the problem (optimalclusterings) is not empty.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 14: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 12'

&

$

%

Criterion function

Criterion functions can be constructed

• indirectly as a function of a compatible (dis)similarity measure betweenpairs of units, or

• directly as a function measuring the fit of a clustering to an idealone with perfect relations within each cluster and between clustersaccording to the considered types of connections (equivalence).

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 15: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 13'

&

$

%

Indirect Approach

−→−→

−−−−→−→

R

Q

D

hierarchical algorithms,

relocation algorithm, leader algorithm, etc.

RELATION

DESCRIPTIONSOF UNITS

original relation

path matrix

triadsorbits

DISSIMILARITYMATRIX

STANDARDCLUSTERINGALGORITHMS

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 16: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 14'

&

$

%

Dissimilarities

We consider the following list of dissimilarities between units xi and xj

where the description of the unit consists of the row and column of theproperty matrix Q = [qij ].

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 17: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 15'

&

$

%

. . . Dissimilarities

Manhattan distance: dm(xi, xj) =n∑

s=1

(|qis − qjs|+ |qsi − qsj |)

Euclidean distance:

dE(xi, xj) =

√√√√ n∑s=1

((qis − qjs)2 + (qsi − qsj)2)

Truncated Manhattan distance: ds(xi, xj) =n∑

s=1s 6=i,j

(|qis − qjs|+ |qsi − qsj |)

Truncated Euclidean distance (Faust, 1988):

dS(xi, xj) =

√√√√√ n∑s=1

s 6=i,j

((qis − qjs)2 + (qsi − qsj)2)

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 18: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 16'

&

$

%

. . . Dissimilarities

Corrected Manhattan-like dissimilarity (p ≥ 0):

dc(p)(xi, xj) = ds(xi, xj) + p · (|qii − qjj |+ |qij − qji|)

Corrected Euclidean-like dissimilarity (Burt and Minor, 1983):

de(p)(xi, xj) =√

dS(xi, xj)2 + p · ((qii − qjj)2 + (qij − qji)2)

Corrected dissimilarity:

dC(p)(xi, xj) =√

dc(p)(xi, xj)

The parameter, p, can take any positive value. Typically, p = 1 or p = 2,where these values count the number of times the corresponding diagonalpairs are counted.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 19: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 17'

&

$

%

. . . DissimilaritiesIt is easy to verify that all expressions from the list define a dissimilarity(i.e. that d(x, y) ≥ 0; d(x, x) = 0; and d(x, y) = d(y, x)). Each of thedissimilarities from the list can be assessed to see whether or not it is also adistance: d(x, y) = 0 ⇒ x = y and d(x, y) + d(y, z) ≥ d(x, z).

The dissimilarity measure d is compatible with a considered equivalence ∼if for each pair of units holds

Xi ∼ Xj ⇔ d(Xi, Xj) = 0

Not all dissimilarity measures typically used are compatible with structuralequivalence. For example, the corrected Euclidean-like dissimilarity iscompatible with structural equivalence.

The indirect clustering approach does not seem suitable for establishingclusterings in terms of regular equivalence since there is no evident wayhow to construct a compatible (dis)similarity measure.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 20: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 18'

&

$

%

The Hierarchical Approach

Agglomerative hierarchical clustering algorithms usually assume that allrelevant information on the relationships between the n units from the set Uis summarized by a symmetric pairwise dissimilarity matrix D = [dij ]. Thescheme of the agglomerative hierarchical algorithm is:

Each unit is a cluster: Ci = {xi} , xi ∈ U , i = 1, 2, . . . , n;repeat while there exist at least two clusters:

determine the nearest pair of clusters Cp and Cq:d(Cp, Cq) = minu,v d(Cu, Cv) ;

fuse the clusters Cp and Cq to form a new clusterCr = Cp ∪ Cq;

replace Cp and Cq by the cluster Cr;determine the dissimilarities between the cluster Cr

and other clusters.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 21: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 19'

&

$

%

Dissimilarity update

hhhhhhhhh

���������

�sCj

sCi s Ckd(Ci, Cj)

d(Cj , Ck)

d(Ci, Ck)

According to the last step of this al-gorithm, we have to determine thedissimilarity d between the newlyformed cluster Cr and all other, pre-viously established, clusters. Thiscan be done in many different ways,each of which determines a differ-ent hierarchical clustering method.

The resulting clustering (hierarchy) can be represented graphically by theclustering tree (dendrogram).

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 22: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 20'

&

$

%

Methods

Suppose further, that the clusters Ci and Cj are the closest. They are fusedto form a new cluster Ci ∪ Cj . The methods of creating the dissimilaritybetween the new cluster and an extant cluster Ck include the following:

• The Minimum method, or single linkage, (Florek K. et al., 1951;Sneath P., 1957):

d(Ci ∪ Cj , Ck) = min(d(Ci, Ck), d(Cj , Ck))

• The Maximum method, or complete linkage, (McQuitty L., 1960):

d(Ci ∪ Cj , Ck) = max(d(Ci, Ck), d(Cj , Ck))

• The McQuitty method (McQuitty L., 1966; 1967):

d(Ci ∪ Cj , Ck) =d(Ci, Ck) + d(Cj , Ck)

2

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 23: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 21'

&

$

%

. . . Methods

• The Average method (Sokal R. and Michener C., 1958):

d(Ci ∪ Cj , Ck) =1

(ni + nj)nk

∑u∈Ci∪Cj

∑v∈Ck

d(u, v)

where ni denotes the number of units in the cluster Ci.

• The Gower method (Gower J., 1967):

d(Ci ∪ Cj , Ck) = d2(tij , tk)

where tij denotes the centroid of the fused cluster Ci ∪ Cj and tk thecenter of the cluster Ck.

• The Ward method (Ward, 1963):

d(Ci ∪ Cj , Ck) =(ni + nj)nk

(ni + nj + nk)d2(tij , tk)

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 24: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 22'

&

$

%

Example: Support network among informaticsstudents

The analyzed network consists of social support exchange relation amongfifteen students of the Social Science Informatics fourth year class(2002/2003) at the Faculty of Social Sciences, University of Ljubljana.Interviews were conducted in October 2002.

Support relation among students was identified by the following question:

Introduction: You have done several exams since you are in thesecond class now. Students usually borrow studying material fromtheir colleagues.

Enumerate (list) the names of your colleagues that you have mostoften borrowed studying material from. (The number of listedpersons is not limited.)

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 25: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 23'

&

$

%

Class network

b02

b03

g07

g09

g10

g12

g22 g24

g28

g42

b51

g63

b85

b89

b96

class.net

Vertices represent studentsin the class; circles – girls,squares – boys. Oppositepairs of arcs are replaced byedges.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 26: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 24'

&

$

%

Indirect approachPajek - Ward [0.00,7.81]

b51

b89

b02

b96

b03

b85

g10

g24

g09

g63

g12

g07

g28

g22

g42

Using Corrected Euclidean-likedissimilarity and Ward clusteringmethod we obtain the followingdendrogram.From it we can determine the num-ber of clusters: ’Natural’ cluster-ings correspond to clear ’jumps’ inthe dendrogram.If we select 3 clusters we get thepartition C.

C = {{b51, b89, b02, b96, b03, b85, g10, g24},{g09, g63, g12}, {g07, g28, g22, g42}}

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 27: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 25'

&

$

%

Partition in 3 clusters

b02

b03

g07

g09

g10

g12

g22 g24

g28

g42

b51

g63

b85

b89

b96

On the picture, ver-tices in the samecluster are of thesame color.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 28: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 26'

&

$

%

MatrixPajek - shadow [0.00,1.00]

b02

b03

g10

g24 C1

b51

b85

b89

b96

g07

g22 C2

g28

g42

g09

g12 C3

g63

b02

b03

g10

g24

C

1

b51

b85

b89

b96

g07

g22

C

2

g28

g42

g09

g12

C

3

g63

The partition can be usedalso to reorder rows andcolumns of the matrix repre-senting the network. Clus-ters are divided using bluevertical and horizontal lines.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 29: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 27'

&

$

%

Direct Approach and GeneralizedBlockmodeling

The second possibility for solving the blockmodeling problem is to constructan appropriate criterion function directly and then use a local optimizationalgorithm to obtain a ‘good’ clustering solution.

Criterion function P (C) has to be sensitive to considered equivalence:

P (C) = 0 ⇔ C defines considered equivalence.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 30: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 28'

&

$

%

Generalized BlockmodelingA blockmodel consists of structures ob-tained by identifying all units from thesame cluster of the clustering C. Foran exact definition of a blockmodel wehave to be precise also about whichblocks produce an arc in the reducedgraph and which do not, and of whattype. Some types of connections arepresented in the figure on the next slide.The reduced graph can be representedby relational matrix, called also imagematrix.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 31: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 29'

&

$

%

Block Types and Matrices

1 1 1 1 1 1 0 0

1 1 1 1 0 1 0 1

1 1 1 1 0 0 1 0

1 1 1 1 1 0 0 0

0 0 0 0 0 1 1 1

0 0 0 0 1 0 1 1

0 0 0 0 1 1 0 1

0 0 0 0 1 1 1 0

C1 C2

C1 complete regular

C2 null complete

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 32: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 30'

&

$

%

Block Types

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 33: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 31'

&

$

%

Generalized equivalence / Block Types

Y1 1 1 1 1

X 1 1 1 1 11 1 1 1 11 1 1 1 1complete

Y0 1 0 0 0

X 1 1 1 1 10 0 0 0 00 0 0 1 0

row-dominant

Y0 0 1 0 0

X 0 0 1 1 01 1 1 0 00 0 1 0 1

col-dominant

Y0 1 0 0 0

X 1 0 1 1 00 0 1 0 11 1 0 0 0

regular

Y0 1 0 0 0

X 0 1 1 0 01 0 1 0 00 1 0 0 1

row-regular

Y0 1 0 1 0

X 1 0 1 0 01 1 0 1 10 0 0 0 0col-regular

Y0 0 0 0 0

X 0 0 0 0 00 0 0 0 00 0 0 0 0

null

Y0 0 0 1 0

X 0 0 1 0 01 0 0 0 00 0 0 1 0

row-functional

Y1 0 0 00 1 0 0

X 0 0 1 00 0 0 00 0 0 1

col-functional

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 34: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 32'

&

$

%

Characterizations of Types of Blocksnull nul all 0 ∗

complete com all 1 ∗

regular reg 1-covered rows and columns

row-regular rre each row is 1-covered

col-regular cre each column is 1 -covered

row-dominant rdo ∃ all 1 row ∗

col-dominant cdo ∃ all 1 column ∗

row-functional rfn ∃! one 1 in each row

col-functional cfn ∃! one 1 in each column

non-null one ∃ at least one 1∗ except this may be diagonal

A block is symmetric iff ∀X,Y ∈ Ci × Cj : (XRY ⇔ YRX).

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 35: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 33'

&

$

%

Criterion function

One of the possible ways of constructing a criterion function that directlyreflects the considered equivalence is to measure the fit of a clustering toan ideal one with perfect relations within each cluster and between clustersaccording to the considered equivalence.

Given a clustering C = {C1, C2, . . . , Ck}, let B(Cu, Cv) denote the set ofall ideal blocks corresponding to block R(Cu, Cv). Then the global error ofclustering C can be expressed as

P (C) =∑

Cu,Cv∈C

minB∈B(Cu,Cv)

d(R(Cu, Cv), B)

where the term d(R(Cu, Cv), B) measures the difference (error) betweenthe block R(Cu, Cv) and the ideal block B. d is constructed on the basis ofcharacterizations of types of blocks. The function d has to be compatiblewith the selected type of equivalence.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 36: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 34'

&

$

%

. . . criterion function

For example, for structural equivalence, the term d(R(Cu, Cv), B) can beexpressed, for non-diagonal blocks, as

d(R(Cu, Cv), B) =∑

X∈Cu,Y∈Cv

|rX Y − bX Y|.

where rX Y is the observed tie and bX Y is the corresponding value in anideal block. This criterion function counts the number of 1s in erstwhilenull blocks and the number of 0s in otherwise complete blocks. These twotypes of inconsistencies can be weighted differently.

Determining the block error, we also determine the type of the best fittingideal block (the types are ordered).

The criterion function P (C) is sensitive iff P (C) = 0 ⇔ µ (determinedby C) is an exact blockmodeling. For all presented block types sensitivecriterion functions can be constructed (Batagelj, 1997).

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 37: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 35'

&

$

%

Solving the blockmodeling problem

The obtained optimization problem can be solved by local optimization(e.g., relocation algorithm).

Determine the initial clustering C;repeat:

if in the neighborhood of the current clustering Cthere exists a clustering C′ such that P (C′) < P (C)then move to clustering C′ .

The neighborhood in this local optimization procedure is determined by thefollowing two transformations:

• moving a unit Xk from cluster Cp to cluster Cq (transition);

• interchanging units Xu and Xv from different clusters Cp and Cq

(transposition).

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 38: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 36'

&

$

%

Benefits from Optimization Approach• ordinary / inductive blockmodeling: Given a network N and set of

types of connection T , determine the model M;

• evaluation of the quality of a model, comparing different models,analyzing the evolution of a network (Sampson data, Doreian andMrvar 1996; states / continents): Given a network N , a model M, andblockmodeling µ, compute the corresponding criterion function;

• model fitting / deductive blockmodeling: Given a network N , set oftypes T , and a family of models, determine µ which minimizes thecriterion function (Batagelj, Ferligoj, Doreian, 1998).

• we can fit the network to a partial model and analyze the residualafterward;

• we can also introduce different constraints on the model, for example:units X and Y are of the same type; or, types of units X and Y are notconnected; . . .

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 39: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 37'

&

$

%

Pre-specified blockmodelingIn the previous slides the inductive approaches for establishing blockmodelsfor a set of social relations defined over a set of units were discussed.Some form of equivalence is specified and clusterings are sought that areconsistent with a specified equivalence.

Another view of blockmodeling is deductive in the sense of starting with ablockmodel that is specified in terms of substance prior to an analysis.

In this case given a network, set of types of ideal blocks, and a reducedmodel, a solution (a clustering) can be determined which minimizes thecriterion function.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 40: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 38'

&

$

%

Pre-Specified Blockmodels

The pre-specified blockmodeling starts with a blockmodel specified, interms of substance, prior to an analysis. Given a network, a set of idealblocks is selected, a family of reduced models is formulated, and partitionsare established by minimizing the criterion function.

The basic types of models are:

* * *

* 0 0

* 0 0

* 0 0

* * 0

? * *

* 0 0

0 * 0

0 0 *

center - hierarchy clustering

periphery

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 41: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 39'

&

$

%

Prespecified blockmodeling example

We expect that center-periphery model exists in the network: some studentshaving good studying material, some not.

Prespecified blockmodel: (com/complete, reg/regular, -/null block)

1 2

1 [com reg] -

2 [com reg] -

Using local optimization we get the partition:

C = {{b02, b03, b51, b85, b89, b96, g09},

{g07, g10, g12, g22, g24, g28, g42, g63}}

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 42: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 40'

&

$

%

2 clusters solution

b02

b03

g07

g09

g10

g12

g22 g24

g28

g42

b51

g63

b85

b89

b96

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 43: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 41'

&

$

%

ModelPajek - shadow [0.00,1.00]

g07

g10

g12

g22

g24

g28

g42

g63

b85

b02

b03

g09

b51

b89

b96

g07

g10

g12

g22

g24

g28

g42

g63

b85

b02

b03

g09

b51

b89

b96

Image and Error Matrices:

1 2

1 reg -

2 reg -

1 2

1 0 3

2 0 2

Total error = 5center-periphery

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 44: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 42'

&

$

%

The Student Government at the University of Ljubljana in1992

The relation is determined by the following question (Hlebec, 1993):

Of the members and advisors of the Student Government, whomdo you most often talk with about the matters of the StudentGovernment?

m p m m m m m m a a a1 2 3 4 5 6 7 8 9 10 11

minister 1 1 · 1 1 · · 1 · · · · ·p.minister 2 · · · · · · · 1 · · ·minister 2 3 1 1 · 1 · 1 1 1 · · ·minister 3 4 · · · · · · 1 1 · · ·minister 4 5 · 1 · 1 · 1 1 1 · · ·minister 5 6 · 1 · 1 1 · 1 1 · · ·minister 6 7 · · · 1 · · · 1 1 · 1minister 7 8 · 1 · 1 · · 1 · · · 1adviser 1 9 · · · 1 · · 1 1 · · 1adviser 2 10 1 · 1 1 1 · · · · · ·adviser 3 11 · · · · · 1 · 1 1 · ·

minister1

pminister

minister2

minister3

minister4

minister5

minister6

minister7

advisor1

advisor2

advisor3

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 45: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 43'

&

$

%

A Symmetric Acyclic Blockmodel of Student Government

The obtained clustering in 4clusters is almost exact. Theonly error is produced by thearc (a3,m5).

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 46: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 44'

&

$

%

Blockmodeling in 2-mode networksWe already presented some ways of rearranging 2-mode network matricesat the beginning of this lecture.

It is also possible to formulate this goal as a generalized blockmodelingproblem where the solutions consist of two partitions — row-partition andcolumn-partition.

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 47: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 45'

&

$

%

Supreme Court Voting for Twenty-Six Important DecisionsIssue Label Br Gi So St OC Ke Re Sc ThPresidential Election PE - - - - + + + + +

Criminal Law CasesIllegal Search 1 CL1 + + + + + + - - -Illegal Search 2 CL2 + + + + + + - - -Illegal Search 3 CL3 + + + - - - - + +Seat Belts CL4 - - + - - + + + +Stay of Execution CL5 + + + + + + - - -

Federal Authority CasesFederalism FA1 - - - - + + + + +Clean Air Action FA2 + + + + + + + + +Clean Water FA3 - - - - + + + + +Cannabis for Health FA4 0 + + + + + + + +United Foods FA5 - - + + - + + + +NY Times Copyrights FA6 - + + - + + + + +

Civil Rights CasesVoting Rights CR1 + + + + + - - - -Title VI Disabilities CR2 - - - - + + + + +PGA v. Handicapped Player CR3 + + + + + + + - -

Immigration Law CasesImmigration Jurisdiction Im1 + + + + - + - - -Deporting Criminal Aliens Im2 + + + + + - - - -Detaining Criminal Aliens Im3 + + + + - + - - -Citizenship Im4 - - - + - + + + +

Speech and Press CasesLegal Aid for Poor SP1 + + + + - + - - -Privacy SP2 + + + + + + - - -Free Speech SP3 + - - - + + + + +Campaign Finance SP4 + + + + + - - - -Tobacco Ads SP5 - - - - + + + + +

Labor and Property Rights CasesLabor Rights LPR1 - - - - + + + + +Property Rights LPR2 - - - - + + + + +

The Supreme Court Justices andtheir ‘votes’ on a set of 26 “impor-tant decisions” made during the2000-2001 term, Doreian and Fu-jimoto (2002).The Justices (in the order inwhich they joined the SupremeCourt) are: Rehnquist (1972),Stevens (1975), O’Conner (1981),Scalia (1982), Kennedy (1988),Souter (1990), Ginsburg (1993)and Breyer (1994).

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6

Page 48: Anuska Ferligojˇ Vladimir Batagelj University of Ljubljanavlado.fmf.uni-lj.si/pub/networks/doc/seminar/QMSS03c.pdfjam jap jor ken kmr kod kor kuw lao leb lib liy lux maa mat mex mla

A. Ferligoj, V. Batagelj: Blockmodeling 46'

&

$

%

. . . Supreme Court Voting / a (4,7) partition

Pajek - shadow

[0.00,1.00]

Cam

paignFi

VotingR

igh

DeportA

lie

IllegalS.1

IllegalS.2

StayE

xecut

PG

AvP

layer

Privacy

Imm

igratio

LegalAidP

o

Crim

inalAl

IllegalS.3

CleanA

irAc

Cannabis

NY

Tim

es

SeatB

elt

UnitedF

ood

Citizenshi

PresE

lecti

Federalism

FreeS

peech

CleanW

ater

TitleIV

Tobacco

LaborIssue

Property

Rehnquest

Thomas

Scalia

Kennedy

OConner

Breyer

Ginsburg

Souter

Stevens

upper – conservative / lower – liberal

ESF, QMSS Workshop, Ljubljana, July 2005 s s y s l s y ss * 6