1234 G 1 1 3 Exp 1 2 3 4 G So as not to duplicate axes, this copy of G should be folded over to...

Page 1:

[Figure: axes Exp and G (ticks 1 to 4), with the G axis drawn twice.]

So as not to duplicate axes, this copy of G should be folded over to coincide with the other copy, producing a "conical" unipartite card.

The Bipartite, Unipartite-on-Part Experiment Gene Relationship, EGG

Page 2:

[Figure: axes and cards. Axes: Customer, Item, Gene, Doc, Exp, Author, term, People, PI. Cards: cust item card; authordoc card; termdoc card; docdoc card (hyperlink anal.); termterm card (share stem?); expgene card; genegene card (ppi); expPI card.]

Each axis, a, inherits a frequency attribute from each of its cards, c(a,b), denoted bf(c.a): "the number of b values related to a" (e.g., df(t) = document frequency of term t). Of course, bf(c.a) is also inherited, redundantly, by the card c(a,b).

Each card, c(a,b), inherits a frequency attribute from each of its axes, a [b], denoted af(a,b): "the number of times a is related to b in c" [bf(a,b): "the number of times b ~ a in c"].
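The two kinds of frequency attribute can be sketched on a toy termdoc card. This is only an illustration: the data, the names `occurrences`, `pair_freq`, and `axis_freq`, and the choice to store a card as a list of (a, b) occurrences are all assumptions, not part of the slides.

```python
from collections import Counter

# Hypothetical termdoc card: each (term, doc) pair is listed once per occurrence.
occurrences = [
    ("gene", "d1"), ("gene", "d1"), ("gene", "d2"),
    ("card", "d1"), ("axis", "d2"), ("axis", "d3"),
]

# Card-level attribute: number of times a is related to b in the card.
pair_freq = Counter(occurrences)

def axis_freq(pair_freq, position):
    """For each value on one axis, count the distinct values on the other
    axis it is related to (position 0 = first axis, 1 = second axis)."""
    freq = Counter()
    for pair in pair_freq:  # iterates over distinct pairs only
        freq[pair[position]] += 1
    return freq

doc_freq = axis_freq(pair_freq, 0)    # df(t): number of docs related to term t
term_count = axis_freq(pair_freq, 1)  # number of distinct terms related to each doc

print(pair_freq[("gene", "d1")])  # 2
print(doc_freq["gene"])           # 2
print(term_count["d1"])           # 2
```

Here `doc_freq` plays the role of df(t) on the term axis, while `pair_freq` is a label on individual (term, doc) cells of the card.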

Each card, c(a,b), can be expanded along each of its axes, e.g., a, to a-sets, where each a value is identified with the singleton {a} (e.g., itemsets in MBR), or a-sets can become a new axis (e.g., doc in IR). Note: if term is expanded by singleton termsets to be part of doc, then the termdoc card becomes a cone (see first slide).
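As a rough sketch of expanding an axis to a-sets, the following builds a cust itemset card from a cust item card. The data, the helper `expand_axis`, the size cap `max_size`, and the rule "a customer is related to an itemset when the basket contains all of its items" are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical cust item card: customer -> set of items bought.
cust_item = {
    "c1": {"bread", "milk"},
    "c2": {"milk"},
    "c3": {"bread", "milk", "eggs"},
}

def expand_axis(values, max_size):
    """List all non-empty subsets of the axis values up to max_size.
    Size-1 subsets are the singletons {a} standing in for the original values."""
    ordered = sorted(values)
    return [frozenset(combo)
            for size in range(1, max_size + 1)
            for combo in combinations(ordered, size)]

items = set().union(*cust_item.values())
itemsets = expand_axis(items, max_size=2)

# cust itemset card: a customer is related to an itemset when the
# customer's basket contains every item in it (an assumed relation).
cust_itemset = {
    (cust, s)
    for cust, basket in cust_item.items()
    for s in itemsets
    if s <= basket
}

print(len(itemsets))                                          # 6
print(("c1", frozenset({"bread", "milk"})) in cust_itemset)   # True
```

The cap on subset size is only there to keep the expanded axis small; the full expansion grows exponentially with the number of items.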

Next we put some of the descriptive attributes in their places.

Note: Confident and non-confident rules partition the itemset-itemset card. Can we usefully list the confident rules by specifying the boundary (SVM style)? That presupposes spatial continuity of the confident rules, which may not be a correct assumption, but might it hold on another, similar card?

[Figure: DataDex Model. Axes: Item, ItemSet (antecedent). Cards: itemset itemset card; genegene card (ppi).]

Supp(A) = CusFreq(ItemSet)

Conf(A→B) = Supp(AB)/Supp(A), where AB is the itemset A ∪ B
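A minimal sketch of the support and confidence attributes, assuming Supp counts the customers whose basket contains the itemset and reading the confidence formula as the rule A → B with AB meaning A ∪ B. The data and the function names `supp` and `conf` are hypothetical.

```python
# Hypothetical baskets: customer -> set of items.
baskets = {
    "c1": {"bread", "milk"},
    "c2": {"milk"},
    "c3": {"bread", "milk", "eggs"},
    "c4": {"bread", "eggs"},
}

def supp(itemset, baskets):
    """Customer frequency of an itemset: customers whose basket contains it."""
    return sum(1 for basket in baskets.values() if itemset <= basket)

def conf(antecedent, consequent, baskets):
    """Conf(A -> B) = Supp(A union B) / Supp(A); 0.0 when Supp(A) is 0."""
    base = supp(antecedent, baskets)
    if base == 0:
        return 0.0
    return supp(antecedent | consequent, baskets) / base

a, b = frozenset({"bread"}), frozenset({"milk"})
print(supp(a, baskets))     # 3
print(conf(a, b, baskets))  # 2 of the 3 bread baskets also hold milk
```

With a threshold on `conf`, each (antecedent, consequent) cell of the itemset itemset card falls on the confident or the non-confident side, which is the partition mentioned in the note.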

Page 3:

Page 4:

DataDex Model combining term doc and item itemset (no animation)

[Figure: Axes: Customer, Item, Gene, Exp, Author, People, PI, term, doc, ItemSet (antecedent). Cards: cust itemset card; author doc; expgene card; genegene card (ppi); expPI card; itemset itemset card; termterm; termdoc.]

ItemSet can be replaced by ItemBag (allowing duplicates and promoting count analysis).
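One way to picture the ItemBag variant is with multisets. The sketch below assumes a basket is a bag of item counts and that a basket contains an itembag when it holds at least as many copies of each item; the data and the helpers `contains_bag` and `supp_bag` are hypothetical.

```python
from collections import Counter

# Hypothetical baskets as bags: item -> number of copies bought.
basket_bags = {
    "c1": Counter({"milk": 2, "bread": 1}),
    "c2": Counter({"milk": 1}),
    "c3": Counter({"milk": 3, "eggs": 12}),
}

def contains_bag(basket, itembag):
    """True when the basket holds at least as many copies of every item."""
    return all(basket[item] >= n for item, n in itembag.items())

def supp_bag(itembag, basket_bags):
    """Number of customers whose basket contains the itembag, duplicates included."""
    return sum(1 for basket in basket_bags.values()
               if contains_bag(basket, itembag))

print(supp_bag(Counter({"milk": 1}), basket_bags))  # 3
print(supp_bag(Counter({"milk": 2}), basket_bags))  # 2
```

With sets, the two queries above would be indistinguishable; with bags, the count of an item in a basket becomes something that can be analyzed.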

Page 5:

DataDex uncombining term-doc and item-itemset (using itembag (basket), so item count in a basket is defined).

[Figure: Axes: Customer, Item, Gene, Doc, Exp, Author, term, People, PI, ItemBag. Cards: cust itembag card; authordoc card; termdoc card; docdoc card; termterm card (share stem?); expgene card; genegene card (ppi); expPI card; itembag itembag card.]

What is term frequency? doc frequency?

1. TD is a bag-edged graph, i.e., Edge(TD) is a bag, meaning an edge can occur multiple times (the same term can occur in a doc many times). If we don't distinguish those occurrences other than by existence (we could distinguish them into type classes, e.g., verb, noun, ...), then TD can be realized as a set-edged graph with a count label; otherwise we must use a bag-edged graph with a type label. Usually TD is the former, and the count label is term frequency.

Document frequency is a Term node label, which is the node degree (the number of docs to which the term relates).

A market basket is also a bag-edged graph which is realized as a set-edged graph with a count label.
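The two realizations of a bag-edged graph can be sketched as follows. The edge list, the type classes, and the names `edge_bag`, `tf`, `typed`, and `df` are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical bag of TD edges: (term, doc, type), one entry per occurrence.
edge_bag = [
    ("fold", "d1", "verb"), ("fold", "d1", "noun"), ("fold", "d1", "verb"),
    ("fold", "d2", "verb"), ("card", "d1", "noun"),
]

# Occurrences not distinguished: set-edged graph with a count label
# (the count on edge (term, doc) is the term frequency).
tf = Counter((term, doc) for term, doc, _ in edge_bag)

# Occurrences distinguished by type class: keep a count per typed edge.
typed = Counter(edge_bag)

# Document frequency: degree of each term node in the set-edged graph.
df = Counter(term for term, _ in tf)

print(tf[("fold", "d1")])             # 3
print(typed[("fold", "d1", "verb")])  # 2
print(df["fold"])                     # 2
```

The same collapse applies to a market basket: replace (term, doc) with (item, basket), and the count label becomes the number of copies of the item in the basket.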

Page 6:

DataDex Model

[Figure: Axes: Customer, Item, Doc, Gene, Exp, Author, term, People, PI, ItemSet, Loc (Lat axis, -90..90; Lon axis, 0..360). Cards: genegene card; authdoc card; termdoc card; docdoc card; termterm card; expgene card; expPI card; itemset itemset card; cust itemset card; exp loc card; RSI card.]

Page 7:

DataDex Model

[Figure: Axes: Customer, Item, Doc, Gene, Exp, Author, term, People, PI, ItemBag, Loc (Lat-Lon), Time, Aperture angle axis. Cards: genegene card; authdoc card; termdoc card; docdoc card; termterm card; expgene card; expPI; itembag itembag card; cust itembag card; exp loc card; RSI card; RSI video; Grnd Image card (loc = camera loc); Grnd Video card.]

Page 8:

[Figure: Labels: Exp; term gene; People Author|Cust; People PI; Cust Itembag; AuthDoc; TermTerm (GeneGene); ExpGene; Exp PI; ItemBag; DocTerm; Doc Doc; Itembag Itembag; Loc; Exp Loc; Loc Intensity (Band); Intensity.]

Page 9:

[Figure: Labels: Exp; term gene; People Author|Cust; People PI; Cust Itembag; AuthDoc; TermTerm (GeneGene); ExpGene; Exp PI; ItemBag; DocTerm; Doc Doc; Itembag Itembag; Lat; Lon; ExpLoc (genes from specimen in lat); Band (multispectral multitemporal).]