Relational Databases: Object Relational Mappers – SQLObject II
BCHB524 2013
Lecture 23
11/20/2013 BCHB524 - 2013 - Edwards
Relational Databases
Store information in a table
  Rows represent items
  Columns represent items' properties or attributes

Name           Continent      Region                     Surface Area  Population  GNP
Brazil         South America  South America              8547403       170115000   776739
Indonesia      Asia           Southeast Asia             1904569       212107000   84982
India          Asia           Southern and Central Asia  3287263       1013662000  447114
China          Asia           Eastern Asia               9572900       1277558000  982268
Pakistan       Asia           Southern and Central Asia  796095        156483000   61289
United States  North America  North America              9363520       278357000   8510700
... as Objects
Objects have data members or attributes.
Store objects in a list or iterable.
Abstract away details of underlying RDBMS

c1 = Country()
c1.name = 'Brazil'
c1.continent = 'South America'
c1.region = 'South America'
c1.surfaceArea = 8547403
c1.population = 170115000
c1.gnp = 776739

# initialize c2, ..., c6

countryTable = [ c1, c2, c3, c4, c5, c6 ]

for cnty in countryTable:
    if cnty.population > 100000000:
        print cnty.name, cnty.population
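To make concrete what the object view abstracts away, here is a minimal sketch of the same population filter written directly against a relational table. It uses Python's standard sqlite3 module with an in-memory database rather than SQLObject. The table name `country`, its column names, and the helper `large_countries` are assumptions made for illustration; the rows are the ones from the country table shown earlier.

```python
import sqlite3

# Rows from the country table:
# (name, continent, region, surface area, population, GNP)
ROWS = [
    ('Brazil', 'South America', 'South America', 8547403, 170115000, 776739),
    ('Indonesia', 'Asia', 'Southeast Asia', 1904569, 212107000, 84982),
    ('India', 'Asia', 'Southern and Central Asia', 3287263, 1013662000, 447114),
    ('China', 'Asia', 'Eastern Asia', 9572900, 1277558000, 982268),
    ('Pakistan', 'Asia', 'Southern and Central Asia', 796095, 156483000, 61289),
    ('United States', 'North America', 'North America', 9363520, 278357000, 8510700),
]

def large_countries(min_population):
    """Return (name, population) pairs above min_population, sorted by name."""
    conn = sqlite3.connect(':memory:')
    # Hypothetical schema: one column per attribute of a Country object
    conn.execute("CREATE TABLE country (name TEXT, continent TEXT, region TEXT, "
                 "surface_area INTEGER, population INTEGER, gnp INTEGER)")
    conn.executemany("INSERT INTO country VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    cursor = conn.execute(
        "SELECT name, population FROM country WHERE population > ? ORDER BY name",
        (min_population,))
    result = cursor.fetchall()
    conn.close()
    return result

for name, population in large_countries(1000000000):
    print("%s %d" % (name, population))
```

The loop over objects and the SQL WHERE clause express the same filter; an object relational mapper lets a program be written in the first style while the database does the work of the second.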
Taxonomy Database, from scratch
Specify the model
  Tables: Taxonomy and Name
Populate basic data-values in the Taxonomy table from “small_nodes.dmp”
Populate the Name table from “small_names.dmp”
  Insert basic data-values
  Insert relationship with Taxonomy table
Fix Taxonomy parent relationship
Fix Taxonomy derived information
Use in a program…
Taxonomy Database: model.py
from sqlobject import *
import os.path, sys

dbfile = 'small_taxa.db3'

def init(new=False):
    # Magic formatting for database URI
    conn_str = os.path.abspath(dbfile)
    conn_str = 'sqlite:' + conn_str
    # Connect to database
    sqlhub.processConnection = connectionForURI(conn_str)
    if new:
        # Create new tables (remove old ones if they exist)
        Taxonomy.dropTable(ifExists=True)
        Name.dropTable(ifExists=True)
        Taxonomy.createTable()
        Name.createTable()
Taxonomy Database: model.py
# model.py continued…
class Taxonomy(SQLObject):
    taxid = IntCol(alternateID=True)
    scientific_name = StringCol()
    rank = StringCol()
    parent = ForeignKey("Taxonomy")

class Name(SQLObject):
    taxonomy = ForeignKey("Taxonomy")
    name = StringCol()
    name_class = StringCol()
Taxonomy Database structure
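[Figure: the Taxonomy and Name tables drawn as numbered rows. Name rows carry a taxonomy field (for example, taxonomy: 4) and Taxonomy rows carry a parent field (for example, parent: 2), each pointing at a row of the Taxonomy table.]
Foreign Key: id number of some other row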
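The idea that a foreign key is just the id number of some other row can be sketched with the standard sqlite3 module. This is an illustration only, not the schema SQLObject generates: the column names `taxonomy_id` and `parent_id`, the toy rows, and the helpers `taxid_for_name` and `parent_taxid` are assumptions. The ids mirror the example values taxonomy: 4 and parent: 2.

```python
import sqlite3

conn = sqlite3.connect(':memory:')

# Simplified, hypothetical schema: every row has an integer id, and a
# foreign key column stores the id of some other row.
conn.execute("CREATE TABLE taxonomy "
             "(id INTEGER PRIMARY KEY, taxid INTEGER UNIQUE, parent_id INTEGER)")
conn.execute("CREATE TABLE name "
             "(id INTEGER PRIMARY KEY, name TEXT, taxonomy_id INTEGER)")

# Toy rows: taxonomy row 4 has parent row 2; two names point at taxonomy row 4.
conn.executemany("INSERT INTO taxonomy (id, taxid, parent_id) VALUES (?, ?, ?)",
                 [(2, 9605, None), (4, 9606, 2)])
conn.executemany("INSERT INTO name (id, name, taxonomy_id) VALUES (?, ?, ?)",
                 [(1, 'Homo sapiens', 4), (2, 'human', 4)])

def taxid_for_name(name):
    """Follow name.taxonomy_id to the taxonomy row it points at."""
    row = conn.execute(
        "SELECT t.taxid FROM name n JOIN taxonomy t ON n.taxonomy_id = t.id "
        "WHERE n.name = ?", (name,)).fetchone()
    return row[0] if row else None

def parent_taxid(taxid):
    """Follow taxonomy.parent_id to another row of the same table."""
    row = conn.execute(
        "SELECT p.taxid FROM taxonomy t JOIN taxonomy p ON t.parent_id = p.id "
        "WHERE t.taxid = ?", (taxid,)).fetchone()
    return row[0] if row else None

print(taxid_for_name('human'))
print(parent_taxid(9606))
```

Note that the parent key points back into the same table, which is why the model declares `parent = ForeignKey("Taxonomy")` inside the Taxonomy class itself.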
Populate Taxonomy table: load_taxa.py

import sys
from model import *

init(new=True)

# Read in the taxonomy nodes, populate taxid and rank
h = open(sys.argv[1])
for l in h:
    l = l.strip('\t|\n')
    sl = l.split('\t|\t')
    taxid = int(sl[0])
    rank = sl[2]
    t = Taxonomy(taxid=taxid, rank=rank,
                 scientific_name=None, parent=None)
h.close()
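The strip-then-split idiom used on each line can be tried in isolation. The sketch below wraps it in a hypothetical helper, `parse_dmp_line`, and applies it to two made-up sample lines written in the tab-pipe-tab layout the loader assumes; the sample lines are illustrative and are not copied from the course data files.

```python
def parse_dmp_line(line):
    """Split one tab-pipe-tab delimited line into a list of field strings."""
    # Remove the trailing tab, pipe and newline (strip also trims any of
    # these characters from the start of the line).
    line = line.strip('\t|\n')
    # The remaining fields are separated by tab, pipe, tab.
    return line.split('\t|\t')

# Made-up sample lines in the assumed layout
node_line = '9606\t|\t9605\t|\tspecies\t|\n'
name_line = '9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|\n'

node_fields = parse_dmp_line(node_line)
name_fields = parse_dmp_line(name_line)

print("taxid=%d parent=%d rank=%s"
      % (int(node_fields[0]), int(node_fields[1]), node_fields[2]))
print("name=%s class=%s" % (name_fields[1], name_fields[3]))
```

An empty field in the middle of a line survives as an empty string, so the field positions stay fixed. Empty fields at the very end of a line are removed by the strip, which is worth keeping in mind when indexing the last columns.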
Populate Name table: load_names.py

import sys
from model import *

init()

# Read in the names, populate name, class, and id of
# taxonomy row
h = open(sys.argv[1])
for l in h:
    l = l.strip('\t|\n')
    sl = l.split('\t|\t')
    taxid = int(sl[0])
    name_class = sl[3]
    name = sl[1]
    t = Taxonomy.byTaxid(taxid)
    n = Name(name=name, name_class=name_class, taxonomy=t)
h.close()
Fix up the Taxonomy table: fix_taxa.py

import sys
from model import *

init()

# Read in the taxonomy nodes, get self and parent taxonomy objects,
# and fix the parent field appropriately
h = open(sys.argv[1])
for l in h:
    l = l.strip('\t|\n')
    sl = l.split('\t|\t')
    taxid = int(sl[0])
    parent_taxid = int(sl[1])
    t = Taxonomy.byTaxid(taxid)
    p = Taxonomy.byTaxid(parent_taxid)
    t.parent = p
h.close()

# Find all scientific names and fix their taxonomy objects' scientific
# name fields appropriately
for n in Name.select(Name.q.name_class == 'scientific name'):
    n.taxonomy.scientific_name = n.name
Back to the Taxonomy example
Each taxonomy entry can have multiple names
  Many names can point (ForeignKey) to a single taxonomy entry
  name → taxonomy is easy...
  taxonomy → list of names requires a select statement

from model import *
init()
hs = Taxonomy.byTaxid(9606)
for n in Name.select(Name.q.taxonomy==hs):
    print n.name
Taxonomy table relationships
This relationship (one-to-many) is called a multiple join.
Related joins (many-to-many) too...

class Taxonomy(SQLObject):
    # other data members
    names = MultipleJoin("Name")
    children = MultipleJoin("Taxonomy", joinColumn='parent_id')

from model import *
init()
hs = Taxonomy.byTaxid(9606)
for n in hs.names:
    print n.name
for c in hs.children:
    print c.scientific_name
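A one-to-many lookup of this kind amounts to selecting the rows whose foreign key points at a given row. The sketch below shows roughly that kind of query with the standard sqlite3 module; it does not claim to reproduce the SQL that SQLObject issues. The schema, the toy rows, and the helpers `names_of` and `children_of` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')

# Simplified, hypothetical schema with two toy taxonomy rows
conn.execute("CREATE TABLE taxonomy (id INTEGER PRIMARY KEY, "
             "taxid INTEGER UNIQUE, scientific_name TEXT, parent_id INTEGER)")
conn.execute("CREATE TABLE name "
             "(id INTEGER PRIMARY KEY, name TEXT, taxonomy_id INTEGER)")
conn.executemany("INSERT INTO taxonomy VALUES (?, ?, ?, ?)",
                 [(1, 9605, 'Homo', None), (2, 9606, 'Homo sapiens', 1)])
conn.executemany("INSERT INTO name VALUES (?, ?, ?)",
                 [(1, 'Homo sapiens', 2), (2, 'human', 2), (3, 'Homo', 1)])

def names_of(taxid):
    """All names whose taxonomy_id points at the row with this taxid."""
    rows = conn.execute(
        "SELECT n.name FROM name n JOIN taxonomy t ON n.taxonomy_id = t.id "
        "WHERE t.taxid = ? ORDER BY n.id", (taxid,))
    return [r[0] for r in rows]

def children_of(taxid):
    """All taxonomy rows whose parent_id points at the row with this taxid."""
    rows = conn.execute(
        "SELECT c.scientific_name FROM taxonomy c "
        "JOIN taxonomy p ON c.parent_id = p.id "
        "WHERE p.taxid = ? ORDER BY c.id", (taxid,))
    return [r[0] for r in rows]

print(names_of(9606))
print(children_of(9605))
```

Both helpers filter on the "many" side of the relationship, which is the role the `joinColumn` argument plays when the foreign key column is not named after the class.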
SQLObject Exceptions
What happens when the row isn't in the table?
from model import *

try:
    hs = Taxonomy.get(7921)
    hs = Taxonomy.byTaxid(9606)
except SQLObjectNotFound:
    # if row id 7921 / Tax id 9606 is not in table...
    pass

results = Taxonomy.selectBy(taxid=9606)
if results.count() == 0:
    # No rows satisfy the constraint!
    pass

try:
    first_item = results[0]
except IndexError:
    # No first item in the results
    pass
Example Program

import sys
from model import *
init()

try:
    taxid = int(sys.argv[1])
except IndexError:
    print >>sys.stderr, "Need a taxonomy id argument"
    sys.exit(1)
except ValueError:
    print >>sys.stderr, "Taxonomy id should be an integer"
    sys.exit(1)

# Get taxonomy row
try:
    t = Taxonomy.byTaxid(taxid)
except SQLObjectNotFound:
    print >>sys.stderr, "Taxonomy id",taxid,"does not exist"
    sys.exit(1)

for n in t.names:
    print "Organism",t.scientific_name,"has name",n.name
for c in t.children:
    print "Organism",t.scientific_name,"has child",c.scientific_name,c.taxid
print "Organism",t.scientific_name,"has parent",t.parent.scientific_name,t.parent.taxid
Example Program
# Continued...

# Iterate up through the taxonomy tree from t, to find its genus
r = t
g = None
while r != r.parent:
    if r.rank == 'genus':
        g = r
        break
    r = r.parent

if g is None:
    print "Organism",t.scientific_name,"has no genus"
else:
    print "Organism",t.scientific_name,"has genus",g.scientific_name
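The upward walk can be tried without a database. The sketch below replays it over a small in-memory tree; the dictionary `TREE`, its three deliberately simplified entries, and the helper `find_genus` are assumptions for illustration. As in the loop above, the root is assumed to be its own parent. One small difference: this version also checks the rank of the root before giving up.

```python
# Toy tree: taxid -> (rank, parent taxid). The root (taxid 1) is its own
# parent. Simplified on purpose: real lineages contain many more levels.
TREE = {
    1: ('no rank', 1),
    9605: ('genus', 1),
    9606: ('species', 9605),
}

def find_genus(taxid):
    """Walk from taxid towards the root; return the genus taxid or None."""
    current = taxid
    while True:
        rank, parent = TREE[current]
        if rank == 'genus':
            return current
        if parent == current:
            # Reached the root without seeing a genus
            return None
        current = parent

for start in (9606, 1):
    genus = find_genus(start)
    if genus is None:
        print("Taxid %d has no genus" % start)
    else:
        print("Taxid %d has genus %d" % (start, genus))
```

Stopping when a node is its own parent is what keeps the walk from looping forever at the top of the tree.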
Exercises
Write a Python program using SQLObject to find the taxonomic lineage of a user-supplied organism name.
  Make sure you use the small_taxa.db3 file from the course data-folder.