
EFFICIENT BOUNDS FOR 3D CAYLEY CONFIGURATION SPACE OF PARTIAL 2-TREES

By

UGANDHAR REDDY CHITTAMURU

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA

2010


© 2010 Ugandhar Reddy Chittamuru


To my father, Rajasekhar, mother, Aruna and sister, Greeshma


ACKNOWLEDGMENTS

Before starting this work, I was holding on to assumptions that were detrimental to research. I have the utmost respect for Professor Meera Sitharam, for her skill in acquiring and applying knowledge, for her remarkable patience and composure in handling my inexperience, and for prompting me to reflect on my assumptions. Her advice and questions made this struggle fruitful. Exposure to the deductive process she employs during problem-solving nurtured my thinking.

Knowledge obtained while resolving conflicting assumptions during discussions on varied topics with Christopher Goddard, Ravi Teja Chinta, Siva Kumar Balaga, Venkatakrishnan Ramaswamy, and James Pence, on separate occasions, has influenced my learning and thinking process. My teammates, Aysegul Ozkan and James Pence, were very helpful in sharing their knowledge of this subject and especially in answering my repetitive questions. Aysegul Ozkan integrated the implementation of the algorithm based on this work into a visualization tool for macromolecular assembly that she co-developed.

Two OPS jobs, held at different points in time and offered by Dr. Nico Cellinese and Dr. Amr Abd-Elrahma, provided the necessary financial support. Interaction with Dr. Alper Ungor influenced my decision to pursue this field. Dr. Jorg Peter’s quote, "simplify knowledge by abstracting it, i.e., giving it structure," has opened up new perspectives. Dr. Oscar Boykin was helpful when needed most. My uncles, Manohar and Narendranath, have always assisted me in my career choices.

I am grateful to all these people for their influence, advice, and help.


TABLE OF CONTENTS

ACKNOWLEDGMENTS

LIST OF FIGURES

ABSTRACT

CHAPTER

1 INTRODUCTION
    1.1 Introduction
    1.2 Questions
    1.3 Organization of Thesis

2 BACKGROUND

3 MACROMOLECULAR ASSEMBLY
    3.1 Macromolecular Assembly
    3.2 Geometric Representation of Macromolecular Assembly
        3.2.1 Rigid Assembly
        3.2.2 Choosing Cayley Parameters
    3.3 The Thesis Problem: Bounding Cayley Configuration Space
        3.3.1 General Step-wise Sampling Algorithm
        3.3.2 Other Related Questions

4 GRADATION IN COMPUTING BOUNDS FOR CAYLEY PARAMETERS IN 3-TREES
    4.1 Non-edge Bounds by Polynomial Representation
        4.1.1 Linear
        4.1.2 Quadratic
            4.1.2.1 Non sharing non-edge
            4.1.2.2 Single non-edge shared between multiple tetrahedra
        4.1.3 Cubic
            4.1.3.1 Two variables
            4.1.3.2 Three variables
            4.1.3.3 Four, five, six variables

5 EFFICIENT METHODS FOR COMPUTING BOUNDS
    5.1 Choosing Non-sharing Non-edges for a Partial 2-tree to Extend to a 3-tree
        5.1.1 Theorems
        5.1.2 Improved Sampling Algorithm
            5.1.2.1 Choosing non-edges to construct 3-tree
            5.1.2.2 Order of picking non-edges
    5.2 Characteristics of Non-edges That Fall into Linear Class When New Edges are Added
        5.2.1 Theorems
        5.2.2 Application to Sampling
    5.3 Linear Bounding Box for Partial 2-trees: A Necessary but Not Sufficient Condition
        5.3.1 Theorems
        5.3.2 Practical Application

6 CONCLUSION AND OPEN PROBLEM

REFERENCES

BIOGRAPHICAL SKETCH


LIST OF FIGURES

1-1 Non-edge f computable by triangular inequalities

1-2 Dotted edges (F) are expressible by linear inequalities and G ∪ F is a 3-tree

3-1 Bi-Tether between mi and mj

3-2 3 x 3 molecule active constraint graph by James Pence

4-1 One non-edge shared between tetrahedrons

4-2 Four paths between two vertices

4-3 Cubic system with two variables (a) sharing two variables (b) sharing one variable

4-4 Cubic system with three variables (a) sharing three variables (b) sharing two variables (c) sharing one variable

4-5 Cubic system with four variables (a) sharing three variables (b) sharing two variables (c) sharing one variable

4-6 Cubic system with five variables (a) sharing three variables (b) sharing two variables

4-7 Cubic system with six variables, sharing at most three variables

5-1 Non-sharing non-edge f

5-2 Component of S1

5-3 There exists an order where Step 1 solves linear inequalities while sampling the Cayley configuration space consisting of the dotted edges

5-4 For every order, every iteration of Step 1 solves non-linear inequalities while sampling the Cayley configuration space consisting of the dotted edges

5-5 Non-edge f between Pi, Pj


Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

EFFICIENT BOUNDS FOR 3D CAYLEY CONFIGURATION SPACE OF PARTIAL 2-TREES

By

Ugandhar Reddy Chittamuru

December 2010

Chair: Meera Sitharam
Major: Computer Engineering

A Euclidean Distance Constraint System (EDCS) is defined as a graph G(V, E) with fixed or interval Euclidean distances assigned to its edges. A non-edge is a vertex pair that is not connected by an edge, and hence is not assigned a distance. Each non-edge can take a single value or a range of values, which may be continuous or discontinuous. The set of (squares of) values the non-edges can take is called the Cayley configuration space of the EDCS. The set of (squares of) values the non-edges in a set F can take is the Cayley configuration space of the EDCS projected on F. If G ∪ F is minimally rigid, the parameterized Cayley configuration space taken on F is called a complete Cayley configuration space. A graph G is d-realizable if, for every realization in some finite Euclidean dimension, there is a realization with the same edge lengths in d dimensions. Graphs sharing a complete k-vertex graph are said to be in a k-sum. A k-tree is defined recursively: a complete (k + 1)-vertex graph is a k-tree, and any two k-trees in a k-sum together form a k-tree. The graph formed by any subset of edges of a k-tree is called a partial k-tree.

From a result by Sitharam & Gao [1] we know that if a graph G together with a set of non-edges F is d-realizable, then for any EDCS based on G, the d-dimensional Cayley configuration space on F is convex. Partial 3-trees are a special subset of 3-realizable graphs. A 3-tree is minimally rigid in 3D. There are known methods for computing bounds in 3D for non-edges that extend partial 3-trees into complete 3-trees. These methods can involve solving systems of inequalities consisting of polynomials of higher degree (> 2) with many terms and variables.

From [1], the inequalities that express the range of a non-edge can be classified into either a linear or a non-linear class. A sampling algorithm by nature adds edges to the graph it is sampling. Sometimes a non-edge in the non-linear class falls into the linear class after certain edges are added.

A bounding box, a box enclosing the configuration space and touching its boundaries, is a useful approximation of a configuration space. An EDCS can have different complete Cayley configuration spaces, varying in their parameter sets. Some of these complete Cayley configuration spaces might have a bounding box expressible by linear inequalities.

In this thesis we

• give an iterative algorithm for computing bounds for complete Cayley configuration spaces of partial 2-trees (a subset of partial 3-trees) in 3D, where the complexity of computing each bound is that of either solving linear inequalities or solving a quadratic in a single variable.

• prove a necessary but not sufficient condition to identify partial 2-trees that have a bounding box expressible by linear inequalities.

• prove an exact characterization of non-edges, in a partial 3-tree, that fall into the linear class on addition of edges, such that the graph along with the new edges is also a partial 3-tree.

Problems in applications like molecular assembly have associated EDCS representations, and the desired solutions to these problems are usually a set or series of configurations. Sampling or searching the configuration space of the underlying EDCS is one way to find these configurations. In this thesis, we take macromolecular assembly as an example and show how sampling or searching can be improved using the above algorithm.


CHAPTER 1
INTRODUCTION

1.1 Introduction

In this thesis we discuss an efficient method to improve sampling for a subset of 3-realizable graphs, defined as follows. Graphs sharing a complete k-vertex graph are said to be in a k-sum. A k-tree is defined recursively: a complete (k + 1)-vertex graph is a k-tree, and any two k-trees in a k-sum together form a k-tree. The graph formed by any subset of edges of a k-tree is called a partial k-tree. Some important facts about k-trees relevant to this thesis are: (a) partial 2-trees are exactly the 2-realizable graphs; (b) partial 3-trees are a subset of the 3-realizable graphs; (c) any k-tree is minimally rigid in k dimensions.

Problems such as molecular assembly and protein folding can be represented using an EDCS. Usually the solution to these problems is a set or series of configurations that are part of the complete configuration space of the EDCS. One way to find the desired configurations is to sample the complete Cayley configuration space of the given EDCS. The set of realizable distance assignments to the chosen parameters (non-edges) yields a parameterized Cayley configuration space. If G ∪ F is minimally rigid, the parameterized Cayley configuration space taken on F is a complete Cayley configuration space. A result from [1] promises an efficient structure for the Cayley configuration space, namely squared convexity, when the EDCS graph together with the non-edges is d-realizable and the space is taken in d dimensions. From [1], the sampling complexity is a measure of efficiency for walking through a configuration space. It depends on the complexity of (a) choosing the set of non-edges F and (b) the description of the configuration space as a semi-algebraic set.

The problems mentioned above have EDCS representations in 3D. In the case of assembly, the majority of the graphs that occur are in fact 3-realizable. The configurations of interest in molecular assembly problems are points in the configuration space that satisfy the constraints. These configurations are crucial for a given assembly as they help to

(a) determine the configurational entropy, which in turn determines the free energy;

(b) isolate those intermolecular interactions that are crucial for driving the assembly of a collection of molecular units.

Solving these problems would in turn shed light on robust and spontaneous, but poorly understood, supramolecular and macromolecular self-assembly processes such as helix packing, viral self-assembly, protein crystallization, prion aggregation, and ligand and drug docking.

An efficient algorithm sampling a convex space should not step on points outside the space. Bounds on the parameters are crucial in avoiding this unpopulated space. Computing a description of the configuration space for a partial 3-tree is relatively easy. Finding bounds for the parameters from that description, however, is a non-trivial problem. For non-edges in a 3-tree, its difficulty ranges from solving a system of linear inequalities to solving a system of cubic inequalities. In this thesis, we show that for partial 2-trees a particular choice and order of non-edges can guarantee efficient sampling, where finding bounds for each non-edge requires either solving a linear inequality or solving a single quadratic in one variable.

From [1], the inequalities that express the range of a non-edge can be classified into either a linear or a non-linear class. By nature, a sampling algorithm adds edges to the graph it is sampling. In some graphs, a non-edge that falls into the non-linear class can fall into the linear class once certain edges (edges that preserve the partial 3-tree property) are added. For instance, adding e1 to the graph in Figure 1-1 results in a new partial 3-tree and also makes f fall into the linear class. A characterization of such non-edges can improve sampling performance.

A bounding box, a box enclosing the configuration space and touching its boundaries, is a useful approximation of the configuration space. A bounding box is used to approximately validate a given configuration. Computing such a box is equivalent to finding the range of all the non-edges that constitute the Cayley configuration space. Hence, the complexity of finding a bounding box is related to the complexity of finding the ranges of the non-edges. From [1] we know a characterization of the linear class of non-edges, i.e., non-edges whose range is expressible by a collection of linear inequalities. For graphs with a bounding box expressible by linear inequalities, there is a fast algorithm to compute the bounding box, so a characterization that can identify such graphs is crucial. For the graph in Figure 1-2, for each dotted edge, all the minimal 2-sum components containing that dotted edge are partial 2-trees. This proves the existence of a bounding box expressible by linear inequalities for a specific partial 2-tree in 3D.

Figure 1-1. Non-edge f computable by triangular inequalities (figure labels: e1, f)

Figure 1-2. Dotted edges (F ) are expressible by linear inequalities and G ∪ F is a 3-tree


1.2 Questions

Motivated by the benefits that efficient, complete sampling of the configuration space can bring to the applications listed above, in this thesis we address the following questions.

1. What are the various levels of difficulty in computing bounds of non-edges in partial 3-trees using known methods?

2. Given a partial 2-tree G, what choice and order of non-edges exists for G so that finding bounds iteratively is efficient?

3. What are the characteristics of partial 2-trees that have a bounding box expressible by linear inequalities?

4. What are the characteristics of non-edges, in a partial 3-tree, that switch into the linear class on addition of edges, such that the graph along with the new edges is also a partial 3-tree?

1.3 Organization of Thesis

Chapter 2 gives basic definitions. Chapter 3 introduces macromolecular assembly, details its geometric interpretation, and discusses questions to be resolved for efficient sampling, which is necessary for visualizing molecular interaction in an assembly. Chapters 4 and 5 are the main contributions of this thesis. Chapter 4 shows the gradation in complexity of finding bounds using the description of the configuration space. Chapter 5 shows

• a systematic choice and order of non-edges that can guarantee efficient sampling for partial 2-trees in 3D

• a class of partial 2-trees for which a bounding box expressible by linear inequalities cannot exist in 3D

• characteristics of non-edges in a partial 3-tree that switch into the linear class on addition of edges

Chapter 6 concludes with an open problem.


CHAPTER 2
BACKGROUND

This chapter provides definitions necessary to understand the issues addressed in this

thesis.

Definition 1.

(a) A metric space is a set where the distance d between elements of the set satisfies:

– d(x, y) = 0 ⇔ x = y

– d(x, z) ≤ d(x, y) + d(z, y)

(b) Euclidean space is a metric space with the Euclidean metric.

(c) The Euclidean metric is the distance relation between two points. If $p = (p_1, p_2)$ and $q = (q_1, q_2)$ are points in Euclidean 2-space, then the Euclidean metric is given by the formula

$$d(p, q) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2}$$

(d) A point q is between two points p and r of the metric space if and only if p ≠ q ≠ r and pq + qr = pr. [2]

(e) A subset of a metric space is convex if for every two of its points there exists at least one between point. [2]
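The metric and the betweenness relation of Definition 1 can be checked numerically. The following is a minimal sketch, not part of the thesis; the function names and the tolerance `tol` are assumptions introduced here for illustration.

```python
import math

def euclidean_distance(p, q):
    """Euclidean metric between two points given as equal-length coordinate tuples."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def is_between(p, q, r, tol=1e-9):
    """Return True if q is between p and r: q differs from both and pq + qr = pr.

    `tol` is an illustrative floating-point tolerance, not part of the definition.
    """
    pq = euclidean_distance(p, q)
    qr = euclidean_distance(q, r)
    pr = euclidean_distance(p, r)
    return pq > tol and qr > tol and abs(pq + qr - pr) <= tol
```

For example, the point (1, 0) is between (0, 0) and (2, 0), while (1, 1) is not, because the path through it is strictly longer than the direct distance.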

Definition 2. A Euclidean Distance Constraint System (EDCS) is a graph with distances assigned to its edges. These distance assignments can be exact values or ranges. It is written G = (V, E, δ), where δ : E → R is a distance assignment to the edges.
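One possible way to hold an EDCS in memory is sketched below. It is an illustration only: the class name `EDCS`, its fields, and the interval encoding (an exact length stored as an interval with equal endpoints) are assumptions made here, not the representation used in the thesis implementation.

```python
from dataclasses import dataclass, field
from itertools import combinations

@dataclass
class EDCS:
    """A graph whose edges carry distance intervals (lo, hi); lo == hi means an exact length."""
    vertices: set = field(default_factory=set)
    delta: dict = field(default_factory=dict)  # frozenset({u, v}) -> (lo, hi)

    def add_edge(self, u, v, lo, hi=None):
        """Assign the distance interval [lo, hi] (or the exact value lo) to edge (u, v)."""
        hi = lo if hi is None else hi
        if lo < 0 or hi < lo:
            raise ValueError("distance interval must satisfy 0 <= lo <= hi")
        self.vertices.update((u, v))
        self.delta[frozenset((u, v))] = (lo, hi)

    def non_edges(self):
        """Vertex pairs with no assigned distance, i.e. the candidate Cayley parameters."""
        return [pair for pair in combinations(sorted(self.vertices), 2)
                if frozenset(pair) not in self.delta]
```

With two triangles glued along the edge (1, 2), the only non-edge is the pair (3, 4), which is the kind of vertex pair the later chapters treat as a Cayley parameter.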

Definition 3. A graph G is d-realizable if, for every δ for which there exists a realization in R^k for some finite k, there exists a realization in R^d. [3]

Definition 4.

(a) Graphs sharing a complete k-vertex graph are said to be in a k-sum.

(b) A k-tree is defined recursively: a complete (k + 1)-vertex graph is a k-tree, and any two k-trees in a k-sum form a k-tree.

(c) If the inverse operation of a k-sum cannot be applied to a component, then that component is a minimal k-sum component.

(d) For a non-edge f, a minimal k-sum component containing f is a minimal subgraph that is both a k-sum component and contains the vertices of f.
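Since partial 2-trees are central to the later chapters, it may help to see how membership in that class can be tested. The sketch below uses the standard reduction for graphs of treewidth at most 2 (repeatedly delete a vertex of degree at most 2, joining its two neighbours if it has two); it is a well-known technique swapped in here for illustration, not an algorithm taken from the thesis, and the function name is hypothetical.

```python
def is_partial_2tree(vertices, edges):
    """Check whether a simple graph is a partial 2-tree (treewidth at most 2).

    Repeatedly eliminates a vertex of degree <= 2; when the vertex has two
    neighbours, they are joined by an edge. The graph is a partial 2-tree
    exactly when this process removes every vertex.
    """
    adj = {v: set() for v in vertices}
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    stack = [v for v in adj if len(adj[v]) <= 2]
    while stack:
        v = stack.pop()
        if v not in adj or len(adj[v]) > 2:
            continue  # already eliminated (duplicate stack entry)
        nbrs = list(adj.pop(v))
        for n in nbrs:
            adj[n].discard(v)
        if len(nbrs) == 2:
            a, b = nbrs
            adj[a].add(b)
            adj[b].add(a)
        # Neighbours may now have low enough degree to be eliminated.
        stack.extend(n for n in nbrs if len(adj[n]) <= 2)
    return not adj
```

A triangle and a complete 4-vertex graph with one edge removed both reduce to nothing, whereas the complete 4-vertex graph itself (a 3-tree) has no vertex of degree 2 or less and is rejected.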

Definition 5. For a given EDCS (G, δ):

(a) If there exist only finitely many realizations, then it is called rigid. [4]

(b) If removing any edge from G makes it non-rigid, then G is minimally rigid, or well-constrained.
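As a sanity check connecting Definition 4 and Definition 5, the number of edges of a k-tree can be counted directly from the recursive definition. The derivation below is an illustrative aside rather than a result quoted from the thesis; n denotes the number of vertices, and $T_1$, $T_2$ are two k-trees in a k-sum, which share the $\binom{k}{2}$ edges of a complete k-vertex graph.

$$|E(K_{k+1})| = \binom{k+1}{2}, \qquad |E(T_1 \cup T_2)| = |E(T_1)| + |E(T_2)| - \binom{k}{2}$$

$$\Longrightarrow \quad |E| = kn - \binom{k+1}{2}, \qquad k = 3:\ |E| = 3n - 6, \qquad k = 2:\ |E| = 2n - 3$$

The value 3n − 6 agrees with the usual count of independent distance constraints needed for a minimally rigid generic framework in 3D, which is consistent with the statement that a 3-tree is minimally rigid in 3D. The count alone is only a necessary condition for minimal rigidity.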

Definition 6.

(a) The configuration space of an EDCS is the set of configurations, where each configuration represents a unique set of distance values corresponding to all vertex pairs, or the set of all valid realizations for a given (G, δ).

(b) The set of realizable distance assignments to the chosen parameters (non-edges) yields a parameterized Cayley configuration space. [1]

(c) If G ∪ F is minimally rigid, the parameterized Cayley configuration space taken on F is a complete Cayley configuration space.

(d) The configuration space in d dimensions for a graph G with non-edge set F is denoted φ^d_F(G, δ).

(e) A convex space consisting of points that are squared is a squared convex space.

Definition 7. Sampling complexity depends on the complexity of

• computing the choice of Cayley parameters F

• the semi-algebraic representation of the configuration space, i.e., the complexity of the polynomial inequalities (degree, number of variables, number of terms)

[1]


CHAPTER 3
MACROMOLECULAR ASSEMBLY

This chapter introduces macromolecular assembly and gives a geometric interpretation using configuration spaces. We apply recent advances in characterizing graphs with special types of configuration spaces to the questions of sampling, searching, and visualizing macromolecular assembly configuration spaces.

3.1 Macromolecular Assembly

Intermolecular interactions between a collection of molecular units are subject to constraints such as weak forces, hydrogen bonds, steric collision-avoidance constraints, and tethering constraints. Two molecules mi, mj are bi-tether constrained if two atom pairs, (ai1, ai2) and (bj1, bj2), each consisting of intersecting atoms and each from a different molecule, touch each other as shown in Figure 3-1. The molecular units together with the constraints are the input to an assembly problem. Any valid realization satisfying all the constraints is called an assembly. A complete understanding of the valid molecular configurations that satisfy the constraints is necessary to understand viral self-assembly, protein crystallization, prion aggregation, ligand and drug docking, etc. So tools that help analyze and visualize all configurations of interacting molecular units are necessary. In an assembly, a tree of tethering constraints holds together the molecular units.

3.2 Geometric Representation of Macromolecular Assembly

Each molecule is a 3D rigid body consisting of atoms. A molecular assembly constraint system has nodes representing atomic units within rigid molecules and edges representing the distance intervals between two atomic units. By selecting non-edges between atom pairs such that each molecular pair becomes rigid, we make the whole assembly rigid. By mapping tether-constrained atoms to vertices and tether distances to edges, a molecular pair in an assembly can be given an EDCS. Because other atoms in the molecules enforce collision avoidance, the configuration space of the EDCS given only by the tethers is not the complete desired solution. To demonstrate the geometric representation of molecular assembly, in the following sections we consider a molecular assembly consisting of two molecules with tether constraints. Further details are given in the next section.

Figure 3-1. Bi-Tether between mi and mj (figure labels: molecules mi, mj; atoms ai1, ai2, bj1, bj2)

3.2.1 Rigid Assembly

A rigid body in 3D has at most 6 degrees of freedom: 3 translational and 3 rotational. Each molecule is rigid and hence has 6 degrees of freedom. A tether-constrained molecular assembly results in a tether tree as its EDCS in 3D. For an EDCS based on a graph G in 3D, a complete configuration space can be constructed only if G ∪ F, where F is the set of Cayley parameters of the configuration space, forms a rigid graph in 3D. For efficient sampling, according to [1], a squared convex configuration space can be constructed if G ∪ F is 3-realizable. A pair of tether constraints between two molecules removes 2 degrees of freedom. For an assembly of two molecular units acted upon by a pair of tether constraints, at least 4 more parameters are necessary for it to become one rigid body.
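The count in the paragraph above can be written out explicitly. This is only a restatement of that arithmetic, under the assumption (made here for illustration) that the overall rigid motion of the assembly is factored out, so that only the relative placement of the two molecules is counted:

$$\underbrace{2 \cdot 6}_{\text{two rigid molecules}} - \underbrace{6}_{\text{global rigid motion}} = 6, \qquad 6 - \underbrace{2}_{\text{tether pair}} = 4$$

Under that assumption, 4 is the smallest number of additional Cayley parameters that can pin down the relative placement of the two tethered molecules.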

3.2.2 Choosing Cayley Parameters

A 3-tree is rigid and 3-realizable in 3D. So if G ∪ F forms a 3-tree, then the Cayley configuration space is squared convex. Choosing parameters to make G ∪ F a 3-tree is a graph problem. The only way to make G ∪ F a 3-tree is to involve two more atoms, each belonging to a different molecule, in addition to the atoms tied by tether constraints. For the graph in Figure 3-2, there are many ways to choose parameters between the 6 atoms to form a 3-tree. Let F consist of all the parameters chosen between all tether-tied molecular units. Figure 3-2 is an active constraint graph with non-edge parameters for an assembly of two molecular units with 3 atoms each.

Figure 3-2. 3 x 3 molecule active constraint graph, by James Pence

3.3 The Thesis Problem: Bounding Cayley Configuration Space

Sampling the configuration space means systematically stepping through configurations that have a feasible realization in Cartesian coordinate space satisfying the constraints. The sampling step size is an input parameter. Knowledge of the parameter boundaries is crucial in devising any algorithm for sampling a convex space. Although convex optimization methods can be applied to the description of the configuration space to find parameter boundaries, for reasons discussed in a subsequent chapter, it is preferable to use better methods if they exist.


3.3.1 General Step-wise Sampling Algorithm

Below is a sampling algorithm that scans through the configuration space of G step-wise.

SamplingByStepSize((G, δ), F, step_size)

1: pick a non-edge f from F, and compute its range, r.

2: k ← min(r); while k ≤ max(r) do

3: fix f = k in (G, δ); refer to the augmented graph as G′ = (G ∪ f, (δ, k)).

4: SamplingByStepSize(G′, F − f, step_size)

5: k ← k + step_size

6: end while
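The recursion in the algorithm above can be sketched in Python as follows. This is an illustration under stated assumptions, not the thesis implementation: `compute_range` is a hypothetical callback standing in for Step 1 (the bound computation that the later chapters study), `visit` is a hypothetical callback that receives each fully sampled configuration, and the explicit base case for an empty parameter set is added here because the pseudocode leaves it implicit.

```python
def sample_by_step_size(fixed, non_edges, step_size, compute_range, visit):
    """Recursively step through the Cayley parameters listed in `non_edges`.

    fixed         -- dict mapping an edge to its (already fixed) length
    non_edges     -- list of Cayley parameters still to be sampled
    compute_range -- hypothetical callback (fixed, f) -> (lo, hi), or None if empty
    visit         -- callback invoked with each complete assignment
    """
    if not non_edges:                    # base case: every parameter is fixed
        visit(dict(fixed))
        return
    f, rest = non_edges[0], non_edges[1:]
    r = compute_range(fixed, f)          # Step 1: range of the chosen non-edge
    if r is None:
        return
    lo, hi = r
    steps = int((hi - lo) / step_size + 1e-9)
    for i in range(steps + 1):           # Steps 2, 5, 6: walk the range
        k = lo + i * step_size
        augmented = {**fixed, f: k}      # Step 3: the non-edge becomes an edge
        sample_by_step_size(augmented, rest, step_size, compute_range, visit)  # Step 4
```

The sample values are computed as `lo + i * step_size` rather than by repeated addition, which avoids accumulating floating-point error over long ranges. The cost of the whole procedure is dominated by `compute_range`, which is called once per visited node of the recursion.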

Using known methods, as shown in the next chapter, the range of any non-edge in a 3-tree can be obtained by solving a system of inequalities of degree at most 3. Any convex programming method employed to solve for the bounds of these parameters would therefore be complicated, and Step 1 of the above algorithm would end up performing such computations. This leads to the question: is there a way to avoid such expensive computations while sampling? Chapter 5 addresses this question.

3.3.2 Other Related Questions

In Step 3 of the sampling algorithm discussed above, a non-edge is made an edge and sampling is called again on the new graph. When sampling the graph in Figure 1-1, if e1 is the last non-edge to become an edge, then for any order chosen, every execution of Step 1 solves only non-linear inequalities. The earlier argument showed that there exists an order in which Step 1 solves linear inequalities for at least one of the non-edges. A particular order is beneficial to some non-edges, as it lets them fall into the linear class. Characteristics that can identify such non-edges would contribute to improved sampling. This leads to the question: what are the characteristics of non-edges, in a partial 3-tree, that switch into the linear class on addition of edges, such that the graph along with the new edges is also a partial 3-tree? In Chapter 5, we prove an exact characterization of these non-edges.


A good approximation of a configuration space can lead to an efficient algorithm for validating a configuration. A bounding box is one such approximation. A bounding box expressible by linear inequalities can be computed faster. An EDCS can have many complete Cayley configuration spaces. Some of these configuration spaces might have a bounding box expressible by linear inequalities. For instance, the graph in Figure 1-2 has a bounding box expressible by linear inequalities for the configuration space with the dotted edges as parameters. However, we do not know a way to identify a graph with such a bounding box. This leads to the question: what are the characteristics of a graph that has a bounding box expressible by linear inequalities? In Chapter 5, we partially answer the negative of this question by giving a characteristic of partial 2-trees that cannot have a bounding box expressible by linear inequalities.


CHAPTER 4
GRADATION IN COMPUTING BOUNDS FOR CAYLEY PARAMETERS IN 3-TREES

As discussed in the previous chapter, finding non-edge bounds is an important step
in sampling configuration spaces. In the next chapter we provide an algorithm that
computes efficient bounds for non-edges. To motivate that algorithm, in this chapter we
illustrate a gradation in the complexity of computing straightforward bounds for
non-edges in partial 3-trees.

4.1 Non-edge Bounds by Polynomial Representation

From [1], the Cayley configuration space of any 3-realizable graph in 3D is squared
convex. Partial 3-trees are 3-realizable, so by [1] the range of a non-edge is a single
continuous interval in squared distance space.

4.1.1 Linear

In fact, as [1] observes, for a non-edge f, if all the minimal 2-sum components of
G ∪ f containing f are partial 2-trees, then the range of f can be expressed as a set of
linear inequalities.

4.1.2 Quadratic

4.1.2.1 Non sharing non-edge

In Figure 4-1(a), each tetrahedron has five edges and one non-edge f. Bounds on f can
be computed by solving a quadratic in one variable. A non-edge in Figure 4-1(a) attains
its maximum or minimum value when the tetrahedron has zero volume. In squared
distance space, either equating the volume determinant of the tetrahedron to zero, or
using the equation of a circle to find the coordinates of the two points whose distance
gives the bounds of interest, results in a quadratic in one variable.
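The quadratic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vertex labels 1 to 4, the function names `volume_poly` and `nonedge_bounds`, and the choice of (3, 4) as the non-edge are ours, not the thesis's. The polynomial used is the standard expansion of the Cayley-Menger determinant (equal to 144 times the squared volume) in squared edge lengths.

```python
import math

def volume_poly(d12, d13, d14, d23, d24, d34):
    """144 * V^2 of a tetrahedron, written in squared edge lengths d_ij."""
    return (d12 * d34 * (d13 + d14 + d23 + d24 - d12 - d34)
            + d13 * d24 * (d12 + d14 + d23 + d34 - d13 - d24)
            + d14 * d23 * (d12 + d13 + d24 + d34 - d14 - d23)
            - d12 * d13 * d23 - d12 * d14 * d24
            - d13 * d14 * d34 - d23 * d24 * d34)

def nonedge_bounds(d12, d13, d14, d23, d24):
    """Range of the squared length x = d34 (the assumed non-edge) for which
    the squared volume is non-negative. Assumes d12 > 0."""
    def p(x):
        return volume_poly(d12, d13, d14, d23, d24, x)
    # p is exactly quadratic in x, so three samples recover a*x^2 + b*x + c.
    c = p(0.0)
    a = (p(1.0) + p(-1.0)) / 2.0 - c   # equals -d12, hence negative
    b = (p(1.0) - p(-1.0)) / 2.0
    disc = b * b - 4.0 * a * c
    if disc < 0:
        return None                    # the five fixed edges are not realizable
    root = math.sqrt(disc)
    lo, hi = sorted(((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)))
    return lo, hi

# Five unit edges: the non-edge ranges from 0 (folded) to 3 (flat), squared.
bounds = nonedge_bounds(1.0, 1.0, 1.0, 1.0, 1.0)
```

Because the leading coefficient is negative, the volume polynomial is non-negative exactly between the two roots, which is why the roots are the bounds.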

4.1.2.2 Single non-edge shared between multiple tetrahedra

In Figure 4-1(b), bounds for a single non-edge shared between multiple tetrahedra
are computable by solving a system of quadratic inequalities in a single variable. These
quadratics are obtained by requiring each tetrahedron's volume determinant to be
non-negative. The resulting bounds are attained when at least one of the quadratics
vanishes.
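As a rough sketch of how such a system in one variable could be solved, the snippet below intersects the feasible intervals of several quadratics. It assumes each quadratic is given by hypothetical coefficients (a, b, c) with a < 0, so that a*x^2 + b*x + c >= 0 holds between its two roots. The function name `feasible_interval` is ours.

```python
import math

def feasible_interval(quadratics):
    """Intersect {x : a*x^2 + b*x + c >= 0} over all (a, b, c), with a < 0.

    Returns (lo, hi), or None if the system has no solution.
    """
    lo, hi = -math.inf, math.inf
    for a, b, c in quadratics:
        if a >= 0:
            raise ValueError("expected a negative leading coefficient")
        disc = b * b - 4.0 * a * c
        if disc < 0:
            return None                       # this quadratic is never >= 0
        root = math.sqrt(disc)
        small = (-b + root) / (2.0 * a)       # a < 0 makes this the smaller root
        large = (-b - root) / (2.0 * a)
        lo, hi = max(lo, small), min(hi, large)
    return (lo, hi) if lo <= hi else None

# -x^2 + 3x >= 0 gives [0, 3]; -x^2 + 5x - 4 >= 0 gives [1, 4].
shared = feasible_interval([(-1.0, 3.0, 0.0), (-1.0, 5.0, -4.0)])
# A second quadratic feasible only on [4, 7] leaves nothing in common with [0, 3].
empty = feasible_interval([(-1.0, 3.0, 0.0), (-1.0, 11.0, -28.0)])
```

Each end of the returned interval is a root of one of the input quadratics, matching the observation that the bounds are attained when at least one quadratic vanishes.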

4.1.3 Cubic

Fact 1. If a graph G is a 3-tree, then any two non-adjacent vertices have exactly 3
vertex-disjoint paths between them.

In all the figures referred to in this section, we use the above fact to generate partial
3-trees G. In each of these partial 3-trees, some vertex pairs are edges and the rest are
non-edges, F. These partial 3-trees are constructed (as described below) such that all the
non-edges in F are part of every 3-tree formed from G. So bounds for all the non-edges in
these figures must be computed to sample a complete configuration space of the
corresponding graph. Restating the above fact: if any two vertices (u, v) in a partial
3-tree have more than 3 vertex-disjoint paths between them, then every 3-tree for that
graph must contain the non-edge (u, v). We use this principle to construct a series of
graphs that show the varying complexity of straightforward bounds for non-edges.

The graph in Figure 4-2, including the dotted edge, is a partial 2-tree. Hence the
graph resulting from a 2-sum with a tetrahedron T along the dotted edge remains a
partial 3-tree. Now remove the dotted edge, and refer to the resulting graph as G. By the
fact above, G can be extended to a 3-tree only if the vertex pair of the removed dotted
edge is joined by a non-edge. Repeat the above steps for each remaining edge of T. In
this way we can obtain a tetrahedron with 0 to 6 dotted edges. In all the figures below, a
dotted line represents a 2-sum with Figure 4-2.

4.1.3.1 Two variables

In Figure 4-3(a), take a tetrahedron similar to the one in (a) and perform a 3-sum on
the corresponding shared triangle of (a). Repeat this until there are n tetrahedra. Repeat
the same for (b). The maximum or minimum value of any non-edge is attained when a
tetrahedron has zero volume. Requiring the volume determinant to be non-negative for
two parameters (x and y) results in a polynomial inequality of degree at most three in
two variables.


If the two non-edges of each tetrahedron in Figure 4-3 are

(a) adjacent, then the determinant contains the terms ($x^2$, $y^2$, $xy$, $x$, $y$).

(b) non-adjacent, then the determinant contains the terms ($x^2 y$, $x y^2$, $xy$, $x$, $y$).

As there are n tetrahedra, we can obtain bounds for any non-edge by solving the
system of cubic inequalities. The number of non-edges that exist in the shared triangle is
exactly the number of variables shared between the system of inequalities.
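Which monomials appear depends on whether the two unknown edges share a vertex. The sketch below expands the standard Cayley-Menger volume polynomial with a tiny two-variable polynomial class. The class `Poly`, the vertex labels 1 to 4 and the numeric values chosen for the fixed squared lengths are all illustrative assumptions, picked only so that no coefficient cancels by accident.

```python
class Poly:
    """Minimal polynomial in x and y: maps (deg_x, deg_y) -> coefficient."""
    def __init__(self, terms=None):
        self.terms = {k: v for k, v in (terms or {}).items() if v != 0}
    def __add__(self, other):
        out = dict(self.terms)
        for k, v in other.terms.items():
            out[k] = out.get(k, 0) + v
        return Poly(out)
    def __neg__(self):
        return Poly({k: -v for k, v in self.terms.items()})
    def __sub__(self, other):
        return self + (-other)
    def __mul__(self, other):
        out = {}
        for (a, b), v in self.terms.items():
            for (c, d), w in other.terms.items():
                out[(a + c, b + d)] = out.get((a + c, b + d), 0) + v * w
        return Poly(out)

def const(value):
    return Poly({(0, 0): value})

def degree(p):
    return max(a + b for a, b in p.terms)

def volume_poly(d12, d13, d14, d23, d24, d34):
    """144 * V^2 in squared edge lengths (standard Cayley-Menger expansion)."""
    return (d12 * d34 * (d13 + d14 + d23 + d24 - d12 - d34)
            + d13 * d24 * (d12 + d14 + d23 + d34 - d13 - d24)
            + d14 * d23 * (d12 + d13 + d24 + d34 - d14 - d23)
            - d12 * d13 * d23 - d12 * d14 * d24
            - d13 * d14 * d34 - d23 * d24 * d34)

X, Y = Poly({(1, 0): 1}), Poly({(0, 1): 1})

# Unknowns on opposite edges (1,2) and (3,4): cubic terms x^2*y and x*y^2 appear.
opposite = volume_poly(X, const(2), const(3), const(5), const(7), Y)
# Unknowns on edges (1,2) and (1,3), which share vertex 1: degree drops to two.
adjacent = volume_poly(X, Y, const(2), const(3), const(5), const(7))
```

Inspecting `opposite.terms` and `adjacent.terms` lists the exponent pairs that survive, which can be compared with the two term lists above.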

4.1.3.2 Three variables

In Figure 4-4(a), take a tetrahedron similar to the one in (a) and perform a 3-sum on
the corresponding shared triangle of (a). Repeat this until there are n tetrahedra. Repeat
the same for (b) and (c). A maximum or minimum value of any non-edge is attained when
a tetrahedron has zero volume. Requiring the volume determinant to be non-negative
for three parameters (x, y, z) results in a cubic inequality in three variables.

If each tetrahedron in Figure 4-4 has

(a) all the non-edges adjacent to one another, then the determinant contains the terms ($x^2$, $y^2$, $z^2$, $xy$, $yz$, $zx$, $x$, $y$, $z$).

(b) at least one non-edge that is not adjacent to another, then the determinant contains the terms ($x^2 z$, $x z^2$, $xyz$, $y^2$, $xy$, $xz$, $yz$, $x$, $y$, $z$).

As there are n tetrahedra, we can obtain bounds for any non-edge by solving at most n
cubic inequalities. The number of non-edges that exist in the shared triangle is exactly the
number of variables shared between the system of inequalities.
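For reference, term lists of this kind can be read off from the expanded volume polynomial of a single tetrahedron. The following is the standard expansion of the Cayley-Menger determinant. The labels $d_{ij}$ (the squared distance between vertices $i$ and $j$ of a tetrahedron with vertices 1 to 4) are our own assumed notation and are not taken from the figures.

```latex
\begin{aligned}
144\,V^2 ={}& d_{12}d_{34}\,(d_{13}+d_{14}+d_{23}+d_{24}-d_{12}-d_{34}) \\
 &+ d_{13}d_{24}\,(d_{12}+d_{14}+d_{23}+d_{34}-d_{13}-d_{24}) \\
 &+ d_{14}d_{23}\,(d_{12}+d_{13}+d_{24}+d_{34}-d_{14}-d_{23}) \\
 &- d_{12}d_{13}d_{23} - d_{12}d_{14}d_{24}
  - d_{13}d_{14}d_{34} - d_{23}d_{24}d_{34} \;\ge\; 0 .
\end{aligned}
```

Every monomial has degree three in the squared distances, so treating the edges as constants leaves a polynomial of degree at most three in the non-edge variables. A pair of opposite variables, such as $d_{12}$ and $d_{34}$, appears as a product multiplying a sum that contains both of them, which is the source of terms like $x^2 z$ and $x z^2$.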

4.1.3.3 Four, five, six variables

Figures 4-5 (a, b, c), 4-6 (a, b) and 4-7 have similar cubic systems with 4, 5 and 6
variables respectively.

For Figure 4-5, if each tetrahedron has two

(a) non-adjacent edges, then the determinant contains the terms ($v y^2$, $w x^2$, $x w^2$, $y v^2$, $wxv$, $vyw$, $vyx$, $wyx$, $vw$, $xy$, $yw$, $yv$, $wx$, $x$, $y$, $v$, $w$).

(b) adjacent edges, then the determinant contains the terms ($xy$, $wyx$, $x w^2$, $xv$, $yv$, $wv$, $w x^2$, $wxv$, $xyv$, $v^2$, $y^2$, $yw$, $wx$, $v$, $x$, $y$).


For Figure 4-6, the determinant contains the terms ($vyx$, $wxv$, $v y^2$, $w x^2$, $wyx$, $vyw$, $y v^2$,
$x w^2$, $vw$, $vy$, $wx$, $xv$, $xz$, $yz$, $wxz$, $xyz$, $z^2$, $wz$, $z$, $vyz$, $vwz$, $vz$).

For Figure 4-7, the determinant contains the terms ($vyw$, $wxv$, $v y^2$, $w x^2$, $wyx$, $vyw$, $y v^2$,
$x w^2$, $wxz$, $xyz$, $uxz$, $uyz$, $vuz$, $wuz$, $u z^2$, $u^2 z$, $uyw$, $uyv$, $wux$, $uxv$, $vyz$, $vwz$).

Figure 4-1. One non-edge shared between tetrahedrons (panels a, b)

Figure 4-2. Four paths between two vertices

Figure 4-3. Cubic system with two variables (a) sharing two variables (b) sharing one variable

Figure 4-4. Cubic system with three variables (a) sharing three variables (b) sharing two variables (c) sharing one variable

Figure 4-5. Cubic system with four variables (a) sharing three variables (b) sharing two variables (c) sharing one variable

Figure 4-6. Cubic system with five variables (a) sharing three variables (b) sharing two variables

Figure 4-7. Cubic system with six variables, sharing at most three variables

CHAPTER 5
EFFICIENT METHODS FOR COMPUTING BOUNDS

This chapter addresses the questions raised in Chapter 3.

5.1 Choosing Non-sharing Non-edges for a Partial 2-tree to Extend to a 3-tree

Definition 8. In a 3-tree, an edge that does not participate in any 3-sum is a non-sharing

edge. A non-sharing edge in a 3-tree belongs to a unique tetrahedron.

5.1.1 Theorems

Lemma 5.1.1. If a triangle (va, v2, vb) in a 3-tree G (Figure 5-1) is in 2-sum with a
triangle t, then there exists a non-edge f such that G ∪ t ∪ f is a 3-tree and

(a) f is not part of any 3-sum (shared triangle) in G ∪ t ∪ f

(b) the non-sharing property of all the edges or non-edges in G, except (va, vb), (va, v2), (vb, v2), is preserved in G ∪ t ∪ f

(c) the tetrahedron that f belongs to has only one non-edge, i.e., f itself

Proof. Let (va, vb) be the 2-sum edge shared between t and G. Let v1 be the vertex
of triangle t that is not an endpoint of (va, vb). Similarly, v2 is the vertex of triangle
(va, v2, vb) that is not an endpoint of (va, vb). Joining v1 and v2 with the non-edge f
results in the tetrahedron (v1, va, v2, vb); refer to Figure 5-1. G is a 3-tree and
(v1, va, v2, vb) is a tetrahedron, and the two share the triangle (va, v2, vb), so their union
is a 3-sum. Hence G ∪ t ∪ f is a 3-tree. In the above construction, the only edges shared
in G ∪ t ∪ f are (va, vb), (va, v2), (vb, v2). Hence the non-sharing property of the rest of
the edges or non-edges, including f, still holds.

Theorem 5.1.2. Given a graph G, if G is a 2-tree with at least 4 vertices, then there
exists a set of non-edges F such that

(a) every non-edge in F has the non-sharing property in the 3-tree G ∪ F

(b) the tetrahedron corresponding to each non-sharing non-edge in F has only one non-edge, i.e., itself

Figure 5-1. Non-sharing non-edge f (vertex labels v1, v2, va, vb)

Proof. The proof is by induction on the number of vertices n in the 2-tree. As G is a
2-tree with at least 4 vertices, G has at least two triangles.

Base case, n = 4: G consists of only two triangles, with unshared vertices vc and vx.
This can be viewed as a tetrahedron with the non-edge f between vc and vx.

Induction step: assume the hypothesis is true for n = k, and let G have k + 1 vertices.

Pick a leaf triangle and call it Vk+1. Refer to one triangle that is in 2-sum with Vk+1 as
Vk. Without Vk+1, G − Vk+1 is a 2-tree with k vertices; denote it G′. By the induction
hypothesis, there exists a set F such that G′ ∪ F is a 3-tree, every non-edge in F has the
non-sharing property, and the tetrahedron corresponding to every non-edge in F has a
single non-edge, namely itself. Vk+1 is in 2-sum with the triangle Vk, which is now part
of G′ ∪ F. By Lemma 5.1.1, there exists a non-edge f such that G′ ∪ Vk+1 ∪ F ∪ f, that
is G ∪ F ∪ f, is a 3-tree that preserves the non-sharing property of all edges or non-edges
in G′ ∪ F, except the edges of Vk. Now f is the new non-sharing non-edge in G ∪ F ∪ f.
Also by Lemma 5.1.1, the tetrahedron that f belongs to has only f as a non-edge.
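The induction can be read as a construction. The sketch below is a forward variant of it, not the thesis's own procedure: it assumes the 2-tree is supplied as a base triangle plus a sequence of attachments (each new vertex glued onto an existing edge), and for every new vertex it adds the non-edge joining that vertex to the third vertex of a triangle already containing the attachment edge. The input format and the name `nonsharing_nonedges` are hypothetical.

```python
def nonsharing_nonedges(base, attachments):
    """Return a list F of non-edges such that (2-tree) + F is a 3-tree.

    base: (u, v, w), the first triangle of the 2-tree.
    attachments: list of (new_vertex, (a, b)), where (a, b) is an existing
        edge of the 2-tree that the new triangle is 2-summed onto.
    """
    u, v, w = base
    # For every 2-tree edge, remember one vertex completing it to a triangle.
    third = {frozenset((u, v)): w, frozenset((u, w)): v, frozenset((v, w)): u}
    nonedges = []
    for new, (a, b) in attachments:
        c = third[frozenset((a, b))]        # KeyError if (a, b) is not an edge
        nonedges.append((new, c))           # tetrahedron {new, a, b, c}
        third[frozenset((new, a))] = b
        third[frozenset((new, b))] = a
    return nonedges

# A 2-tree on six vertices: triangle (0, 1, 2) with three attached triangles.
F = nonsharing_nonedges((0, 1, 2), [(3, (0, 1)), (4, (1, 2)), (5, (3, 1))])
```

Each added pair is the only non-edge of its tetrahedron, because the remaining five pairs are edges of two triangles of the 2-tree. A 2-tree on n vertices receives n − 3 non-edges this way, which matches the difference between the 3n − 6 edges of a 3-tree and the 2n − 3 edges of a 2-tree.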


Theorem 5.1.3. There exists an algorithm to sample stepwise the Cayley configuration
space of partial 2-trees in 3D in which computing bounds requires solving either linear
inequalities or a quadratic in one variable.

Proof. The section below gives an algorithm, proving this result.

5.1.2 Improved Sampling Algorithm

The sections below discuss how to pick non-edges in Step 1 so as to avoid solving a
system of inequalities with higher degree, more terms or more variables.

5.1.2.1 Choosing non-edges to construct 3-tree

For any partial 2-tree, we can choose non-edges F1 such that G ∪ F1 is a 2-tree. For

G ∪ F1, from Theorem 5.1.2, there always exists F2 such that G ∪ F1 ∪ F2 is a 3-tree and

every non-edge in F2 has non-sharing property.

5.1.2.2 Order of picking non-edges

In Step 1, pick a non-edge from F2 only when none of the non-edges in F1 are left.
Fixing any subset F of F1 before F2 ensures that G ∪ F is a partial 2-tree. From [1], the
projection of the configuration space on any non-edge of F1 − F can be expressed as a
collection of triangular inequalities. So the range of each non-edge in F1 is computable
from a set of linear inequalities, for any order chosen for F1.

A non-sharing edge in a 3-tree corresponds to only one tetrahedron. As we made sure
that any non-edge in F2 is fixed only after all of F1 is fixed, G ∪ F1 is a 2-tree before any
non-edge of F2 is fixed. As G ∪ F1 is a 2-tree, by Theorem 5.1.2 each non-edge in F2
corresponds to a unique tetrahedron in which it is the only non-edge. So computing the
range of a non-edge in F2 is as complex as solving the Cayley determinant for one
variable, which is a quadratic in a single variable.
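To make the linear case concrete, here is a minimal sketch for the simplest situation only: a non-edge that closes one or more triangles whose other two sides are already fixed. It works with plain lengths rather than squared distances, and the function name `triangle_bounds` is ours. The general partial 2-tree case treated in [1] is not covered by this snippet.

```python
def triangle_bounds(fixed_sides):
    """Range of a non-edge length from triangular inequalities alone.

    fixed_sides: list of (a, b) lengths, one pair per triangle that the
        non-edge would close. Returns (lo, hi), or None if infeasible.
    """
    lo = max(abs(a - b) for a, b in fixed_sides)   # |a - b| <= length
    hi = min(a + b for a, b in fixed_sides)        # length <= a + b
    return (lo, hi) if lo <= hi else None

# Two triangles share the non-edge: sides (3, 4) and sides (2, 5).
rng = triangle_bounds([(3.0, 4.0), (2.0, 5.0)])
# Incompatible triangles: (1, 1) allows at most 2, (5, 2) needs at least 3.
none = triangle_bounds([(1.0, 1.0), (5.0, 2.0)])
```

Only comparisons of sums and differences are needed here, in contrast with the root finding required for a non-edge in F2.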

5.2 Characteristics of Non-edges That Fall into Linear Class When New Edges are Added

Definition 9. Linear class of non-edges consists of those non-edges whose range is

computable by linear inequalities.


5.2.1 Theorems

Theorem 5.2.1. Given a partial 3-tree G ∪ f, if there exists a minimal 2-sum component
containing the non-edge f and having more than one vertex-disjoint path, then for all
edge sets x there exists a δ, an assignment of distances to the edges of G ∪ x, such that
f is not expressible by a collection of triangular inequalities.

Proof. Denote a minimal 2-sum component containing f and having more than one
vertex-disjoint path as m. In m, pick any two vertex-disjoint paths and denote them
P1, P2. Exclude the vertices of f from P1, P2. Since m is a minimal 2-sum component
with two vertex-disjoint paths, there must be a path (P3) between P1 and P2 that does
not contain the vertices of f. As adding edges does not affect the paths in G, every G ∪ x
has P1, P2, P3. The reduction discussed in the proof of Theorem 5.2 in [1], when applied
to G ∪ x, results in one of the two base cases discussed in Theorem 5.2. The assignment
provided for the base cases in Lemma 5.5 of [1] proves the existence of a δ for which the
non-edge f cannot be computed by triangular inequalities.

Theorem 5.2.2. Given a partial 3-tree G ∪ f, if all minimal 2-sum components
containing the non-edge f have only one vertex-disjoint path, then there exists an edge
set x such that, for every assignment δ of distances to the edges of G ∪ x,

(a) G ∪ x is a partial 3-tree.

(b) f is expressible by a collection of triangular inequalities.

Proof. Let v1 and v2 be the vertices of f. Removal of v1 and v2 from G results in a set
of connected components H1...HN. Refer to the graph induced by Hi together with v1
and v2 as Gi. Gi ∪ f either has a minimal 2-sum component containing f that is not a
partial 2-tree, or has a minimal 2-sum component containing f that is a partial 2-tree.
Denote all the Gi ∪ f of the former type by S1 and of the latter type by S2. In our
notation, let the index of every component of S1 be less than that of any component of
S2. The proof is by induction on the number of components n in S1.


Base case

n = 0. With x as the empty set, this case is true by Theorem 5.2 in [1].

n = 1. Refer to the component in S1 as G1. G1 is also a partial 3-tree, as it is a 2-sum
component of G ∪ f. In G1 there is only one vertex-disjoint path between v2 and v1.
Hence there exists at least one articulation point, a1.

Adding edges to G1: if an edge does not exist between the vertices v1, a1, then construct
one. Repeat the same between v2, a1. Refer to the set of these newly constructed edges
as x. By Lemma 5.2.3, G1 ∪ x is a partial 3-tree.

The edges in x are between vertices of G1. Hence the components of S1 ∪ S2 are
invariant under the addition of the edge set x. From the above construction, a minimal
2-sum component containing f in G1 ∪ x is a partial 2-tree. This proves that all the
minimal 2-sum components of G ∪ f ∪ x containing f are partial 2-trees.

Induction step: assume the claim is true for n = k.

Let S1 contain k + 1 components. Pick Gk+1 ∪ f from S1. The vertices v2, v1 in Gk+1
have only one vertex-disjoint path between them, so there exists at least one articulation
point, ak+1. As in the base case n = 1, add edges to Gk+1 and refer to them as xk+1. The
edges in xk+1 are between vertices of Gk+1. Hence the components of S1 ∪ S2 are
invariant under the addition of the edge set xk+1. From the above construction, we know
that a minimal 2-sum component containing f in Gk+1 ∪ xk+1 ∪ f is a partial 2-tree. This
leaves only k components in S1. By the induction hypothesis there exists a set xk of edges
such that each component in S1 results in a partial 2-tree. Refer to xk+1 ∪ xk as x. By
Theorem 5.2 in [1], the non-edge f in G ∪ x is computable by a collection of triangular
inequalities.
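The edge-adding step of this proof can be sketched as follows. The snippet assumes a component is given as an adjacency dictionary and finds a vertex whose removal disconnects v1 from v2 by brute force (one breadth-first search per candidate). The function names and the example graph are hypothetical, and no attempt is made at efficiency.

```python
from collections import deque

def separating_vertices(adj, s, t):
    """Vertices (other than s and t) whose removal disconnects s from t."""
    def reachable_without(removed):
        seen, queue = {s}, deque([s])
        while queue:
            u = queue.popleft()
            if u == t:
                return True
            for w in adj[u]:
                if w != removed and w not in seen:
                    seen.add(w)
                    queue.append(w)
        return False
    return [v for v in adj if v not in (s, t) and not reachable_without(v)]

def edges_to_add(adj, s, t):
    """Edges (s, a) and (t, a) missing at the first separating vertex a."""
    cut = separating_vertices(adj, s, t)
    if not cut:
        return []
    a = cut[0]
    return [(x, a) for x in (s, t) if a not in adj[x]]

# Vertices 1 and 2 are joined only through vertex 5.
graph = {1: [3, 4], 2: [6, 7], 3: [1, 5], 4: [1, 5],
         5: [3, 4, 6, 7], 6: [5, 2], 7: [5, 2]}
new_edges = edges_to_add(graph, 1, 2)
```

The returned pairs play the role of the set x in the proof for one component of S1. Repeating the call per component mirrors the induction.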

Lemma 5.2.3. Given a partial 3-tree G ∪ f and a minimal 2-sum component, if between
the vertices v1, v2 of the non-edge f there exists an articulation vertex v3 such that at
least one of (v1, v3) or (v2, v3) does not exist in G, then G ∪ f ∪ (v1, v3) ∪ (v2, v3) is a
partial 3-tree.


Proof. Any G satisfying the given conditions can be split into Ga, Gb, only sharing an

articulation point, v3, as shown in the Figure 5-2. Due to v3 being an articulation point,

no vertex of Ga has a path of edges to Gb without containing v3. Pick a path in Gb

connecting v2 to v3. Include v2, v3 into the path and call it P1.

Figure 5-2. Component of S1 (labels: Ga, Gb, v1, v2, v3)

If Gb does not have an edge (v2, v3): In Gb, name the connected components that
result when the vertices of P1 are removed H1..Hn. Merge P1 into v3 and refer to the
resulting graph as G′. This results in each of G1..Gn sharing only v3. If n > 0, one of the
minimal 2-sum components would be Ga ∪ f.

If Gb has an edge (v2, v3): Merge v2 into v3 and call the resulting graph G′. This is the
same as Ga ∪ f.

The partial 3-tree property is invariant under all the operations used above. Ga ∪ f is
the same as Ga ∪ (v1, v3). Similarly for Gb ∪ (v2, v3). As each one of Ga ∪ (v1, v3),
Gb ∪ (v2, v3) and f ∪ (v1, v3) ∪ (v2, v3) is a partial 3-tree, G ∪ f ∪ (v1, v3) ∪ (v2, v3)
is a partial 3-tree.


5.2.2 Application to Sampling

Overall, the complexity of Step 1 in the sampling algorithm varies with the choice
of non-edges and the order in which they are fixed. For instance, for the non-edge f in
Figure 5-3, all the minimal 2-sum components containing f have one path, which means
that an edge set x exists by Theorem 5.2.2. After adding the edge e1, all minimal 2-sum
components containing f in G ∪ x ∪ f are partial 2-trees. This e1 is the x for G ∪ f
mentioned in Theorem 5.2.2. By including e1 as part of the Cayley configuration space
and fixing f after fixing e1, we ensure that there exists an order such that the bounds of
f in Step 1 can be computed by solving linear inequalities. This is important because,
for the same graph, the choice of non-edges shown in Figure 5-4 (for every order), or
fixing e1 last in Figure 5-3, results in Step 1 solving non-linear inequalities in every
iteration. The characteristic described in Theorem 5.2.2, when true for a graph, avoids
such a choice and order, thereby improving sampling.

Figure 5-3. There exists an order where Step 1 solves linear inequalities while sampling the Cayley configuration space consisting of the dotted edges (labels: e1, f)

5.3 Linear Bounding Box for Partial 2-trees: A Necessary but Not Sufficient Condition

Definition 10. A non-edge whose range is expressible by linear inequalities for any δ

assignment for the graph is a linear non-edge.

Figure 5-4. For every order, every iteration of Step 1 solves non-linear inequalities while sampling the Cayley configuration space consisting of the dotted edges (label: f)

5.3.1 Theorems

Lemma 5.3.1. Given a partial 2-tree G, if between two vertices v1, v2 in G there exist
≥ 3 vertex-disjoint paths [Pi : (i ≤ n and v1, v2 ⊄ Pi)], then for i ≠ j and i, j ≤ n,

(a) any path connecting Pi to Pj should contain v1 or v2

(b) any non-edge f in a path connecting Pi to Pj that does not involve v1 or v2 is non-linear

Proof. (a) Let us assume that there exists a path between Pi, Pj not involving v1 or v2.
Denote it as Px.

If an edge exists between v1, v2: Refer to the vertex-disjoint path with zero vertices
as P0. Treating (v1, v2) as a non-edge, apply the proof of Theorem 5.2 in [1] to G to
reduce it to the base cases, also discussed in Theorem 5.2 of [1]. In both base cases, one
of the minimal 2-sum components containing (v1, v2) is a tetrahedron, contradicting the
assumption that G is a partial 2-tree.

If an edge does not exist between v1, v2: Every vertex-disjoint path has at least one
vertex. Pick a path v1, Pk, v2 and merge Pkv2 into v2. This results in an edge between
v1, v2. As none of the vertex-disjoint paths between v1, v2 share a vertex with Pk, all the
remaining vertex-disjoint paths will have the same vertex sets. Hence the new graph is
similar to the one treated above.

(b)

If an edge exists between v1, v2: Refer to the vertex-disjoint path with zero vertices
as P0. By the above proof, without a non-edge there cannot exist a path between Pi, Pj,
where i, j ≥ 1. Let f = (vfa, vfb) be the non-edge in the path (Px) connecting Pi to Pj
that does not involve v1 or v2. Let Px be x1, f, x2. Here x1 consists of the distinct
vertices connecting Pi to vfa, including a vertex from Pi and vfa. If |x1| > 1, merge x1
into a vertex of Pi. As x1 does not share vertices with any Px (1 ≤ x ≤ n and i ≠ x), the
resulting graph will have the same number of paths, each consisting of the same set of
vertices. Repeat the same for the vertices in x2. The new graph looks like the one shown
in Figure 5-5. f consists of one vertex from Pi and the other from Pj. Also, between the
vertices of the non-edge f we can see two vertex-disjoint paths, (vfa to v1 to vfb) and
(vfa to v2 to vfb). The edge (v1, v2) acts as a path between these two paths. Using the
proof of Theorem 5.2 of [1], this graph can be reduced to the base cases. Lemma 5.5 of
[1] proves the existence of a δ for which f is non-linear.

If an edge does not exist between v1, v2: Every vertex-disjoint path has at least
one vertex. Pick a path v1Pkv2 and reduce Pkv2 into v2. This results in an edge between
v1, v2. As none of the vertex-disjoint paths between v1, v2 share a vertex with Pk, all the
remaining vertex-disjoint paths will have the same vertex sets. This new graph has at
least 3 vertex-disjoint paths and is similar to the one treated above.

Theorem 5.3.2. Given a partial 2-tree G, if between two vertices v1, v2 in G there exist
≥ 3 vertex-disjoint paths [Pi : (i ≤ n and v1, v2 ⊄ Pi)] such that v1 or v2 are not shared
by 1-sum components of G, then a complete 3-tree consisting of G and linear non-edges
cannot be constructed.

Proof. If an edge exists between v1, v2: Refer the vertex-disjoint path with zero

vertices as P0.

Figure 5-5. Non-edge f between Pi, Pj (labels: f, vfa, vfb, Pi, Pj, v1, v2)

If an edge does not exist between v1, v2: Every vertex-disjoint path has at least one
vertex.

Removal of v1 and v2 from G results in a set of connected components H1...Hk. Refer
to the graph induced by Hi together with v1 and v2 as Gi. From each Hi there exists an
edge to both v1 and v2, because a missing edge would result in 1-sum components
sharing either v1 or v2. vertices(G) = vertices(H1...Hk) ∪ v1 ∪ v2. As all vertices in Gi
are connected, Lemma 5.3.1 says there cannot exist more than one vertex-disjoint path
in Gi. So P1..Pk and H1...Hk are in one-to-one correspondence. Let us assume that the
Ps and Hs with the same index correspond. This implies Pi ⊆ Hi.

A property of a 3-tree is that any two non-adjacent vertices have exactly 3
vertex-disjoint paths between them. We know from Lemma 5.3.1 that a linear non-edge
or an edge cannot exist between Pi, Pj (i ≠ j and 1 ≤ i, j ≤ k). So there should exist at
least 3 vertex-disjoint paths between vertices vi, vj (vi from Pi and vj from Pj). From
Lemma 5.3.1, an edge or a linear non-edge between Hi and Hx (i ≠ x and 1 ≤ x ≤ k)
cannot exist. So, other than between vertices of Hi, any path between vi and the vertices
of G contains either v1 or v2, thereby restricting the number of vertex-disjoint paths to
two, which contradicts the 3-tree property.


5.3.2 Practical Application

The graph in Figure 3-2 does not have 1-sum components but has vertex pairs
with ≥ 3 vertex-disjoint paths. According to Theorem 5.3.2, this is the characteristic of
graphs that cannot have a bounding box expressible by linear inequalities. On the other
hand, in the graph in Figure 1-2, for every vertex pair (e) with ≥ 3 vertex-disjoint paths
there exists a 1-sum component sharing one of its vertices with e. So this graph does not
have the characteristic mentioned in Theorem 5.3.2, which means that a bounding box
expressible by linear inequalities may or may not exist. But we can see that all the dotted
edges match the characteristics mentioned in Theorem 5.2 of [1]. Hence there exists a
bounding box expressible by linear inequalities.
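Checking the condition in practice requires counting vertex-disjoint paths between a vertex pair. One common way, sketched below under our own assumptions, is a unit-capacity maximum flow on a graph in which every vertex is split into an "in" and an "out" copy. The adjacency-dictionary input and the name `count_vertex_disjoint_paths` are illustrative and not part of the thesis.

```python
from collections import deque

def count_vertex_disjoint_paths(adj, s, t):
    """Maximum number of internally vertex-disjoint paths between s and t."""
    cap = {}
    def add_arc(u, v, c):
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)              # residual (reverse) arc
    for v in adj:
        # Internal vertices may be used once; s and t are unrestricted.
        add_arc((v, "in"), (v, "out"), len(adj) if v in (s, t) else 1)
        for w in adj[v]:
            add_arc((v, "out"), (w, "in"), 1)
    source, sink = (s, "in"), (t, "out")
    flow = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:  # breadth-first augmenting path
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        v = sink
        while parent[v] is not None:         # push one unit along the path
            u = parent[v]
            cap[u][v] -= 1
            cap[v][u] += 1
            v = u
        flow += 1

# Vertices 1 and 2 joined through three separate middle vertices.
k23 = {1: [3, 4, 5], 2: [3, 4, 5], 3: [1, 2], 4: [1, 2], 5: [1, 2]}
paths = count_vertex_disjoint_paths(k23, 1, 2)
```

A direct edge between the two vertices is counted as one path, consistent with the zero-vertex path P0 used in the proofs above.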


CHAPTER 6
CONCLUSION AND OPEN PROBLEM

A configuration of a graph G in d dimensions represents G with a unique set of
distance values between all of its vertices, maintaining the edge values and realizability
in d dimensions. Problems like protein folding and macromolecular assembly have EDCS
representations in 3D, and the desired solution to these problems is usually a set or a
series of configurations. One method employed to obtain these configurations is
scanning or searching through the Cayley configuration space. An efficient structure for
the Cayley configuration space is necessary for sampling. If G ∪ F is d-realizable, a
result of [1] guarantees a squared convex structure for the Cayley configuration space.
Obtaining bounds for parameters is crucial for sampling. Using the description of partial
3-trees, we showed a gradation in the complexity of computing bounds for various types
of graphs. The complexity ranges from solving a system of linear inequalities to solving
a system of cubic inequalities.

To avoid solving such higher degree polynomials, we first proved that there exists a
construction of a 3-tree for every 2-tree such that all the non-edges in the resulting
3-tree have the non-sharing property. Using this non-sharing property, we then proved
that there exists an order and choice of non-edges for any partial 2-tree such that finding
bounds for the complete Cayley configuration space requires solving either a system of
linear inequalities or a quadratic in a single variable. We proved this by ensuring that all
the non-edges that have linear characteristics, F1, are fixed before any non-edge with the
non-sharing property, F2, is fixed.

The sampling algorithm adds edges to the graph for which the configuration space is
being sampled. Adding edges in a particular order is beneficial for computing bounds
for some non-edges. We gave an exact characterization of non-edges whose bounds
become expressible by linear inequalities on addition of edges that preserve the partial
3-tree property. We demonstrated with an example how this new classification
contributes to improved sampling of partial 3-trees.

A bounding box is a good approximation of a configuration space. A bounding box
expressible by linear inequalities has a faster description algorithm. In Chapter 5, we
proved a necessary but not sufficient condition for partial 2-trees to have a linearly
expressible bounding box. A more interesting question, and an open problem, is:
characterize partial 2-trees G that can be completed into a 3-tree using a set of non-edges
S such that, for each u in S, G ∪ u is still a (partial) 2-tree.


REFERENCES

[1] M. Sitharam and H. Gao, “Characterizing graphs with convex and connected configuration spaces,” Discrete and Computational Geometry, 2008.

[2] L. M. Blumenthal, Theory and Applications of Distance Geometry. Chelsea Publishing Company, 1970.

[3] M. Belk and R. Connelly, “Realizability of graphs,” Discrete and Computational Geometry, vol. 37, pp. 125–137, 2007.

[4] J. E. Graver, Counting on Frameworks: Mathematics to Aid the Design of Rigid Structures. The Mathematical Association of America, 2001.


BIOGRAPHICAL SKETCH

Ugandhar Chittamuru was born into an agricultural family in Venkannapalem,
Nellore (District), India. He grew up in and around Gudur, a small town in Nellore
(District), until he graduated from high school. He went on to pursue a bachelor’s in
computer science from Vellore Institute of Technology, India, and graduated in 2005.
Later he worked at Cognizant Technology Solutions between 2005 and 2007 before
pursuing a master’s at the University of Florida, Gainesville. During his master’s he held
a part-time web developer position in the Geomatics department and the Florida
Museum of Natural History. He will be graduating with a master’s degree in computer
engineering in December 2010.
