
Matthew P. Johnson, OCL1, CISDD CUNY, F2004

OCL1 Oracle 10g: SQL & PL/SQL Session #10

Matthew P. Johnson

CISDD, CUNY

Fall, 2004

Agenda

Web apps & security
Oracle & XML
RegEx support in 10g
More on the PL/SQL labs
Today’s lab

Review: Why security is hard

It’s a “negative deliverable”
It’s an asymmetric threat
Tolstoy: “Happy families are all alike; every unhappy family is unhappy in its own way.”
    Analogs: “homeland”, jails, debugging, proofreading, Popperian science, fishing, MC algs
So: fix biggest problems first

Injection attacks – DB web apps

Demo: http://pages.stern.nyu.edu/~mjohnson/dbms/perl/login.cgi
Copy from: http://pages.stern.nyu.edu/~mjohnson/dbms/perl/injection.txt

Query built from the form fields u and p:

    SELECT * FROM users
    WHERE user = u AND pass = p;

Consider another input:
    user: your-boss
    pass: ' OR 1=1 OR pass = '

Resulting query:

    SELECT * FROM users
    WHERE user = 'your-boss'
    AND pass = ''
    OR 1=1
    OR pass = '';
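The attack above can be reproduced in a small sketch. This is not the login.cgi script itself: it assumes an in-memory SQLite database standing in for Oracle, a made-up stored password, and a hypothetical unsafe_login helper that pastes the form input straight into the query text.

```python
import sqlite3

def make_db():
    """Create an in-memory users table with one made-up account."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user TEXT, pass TEXT)")
    conn.execute("INSERT INTO users VALUES ('your-boss', 'secret')")
    return conn

def unsafe_login(conn, user, password):
    """Vulnerable on purpose: raw input is concatenated into the SQL text."""
    query = ("SELECT * FROM users WHERE user = '" + user +
             "' AND pass = '" + password + "'")
    return conn.execute(query).fetchall()

conn = make_db()
wrong_guess = unsafe_login(conn, "your-boss", "guess")
injected = unsafe_login(conn, "your-boss", "' OR 1=1 OR pass = '")

# A wrong password finds no row; the injected one makes the WHERE clause
# always true, so the row comes back without knowing the password.
print(len(wrong_guess), len(injected))
```

Because AND binds tighter than OR, the injected condition reads as (user = ... AND pass = '') OR 1=1 OR pass = '', which holds for every row.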

Multi-command injection attacks

Query built from the form fields u and p:

    SELECT * FROM users
    WHERE user = u AND pass = p;

Consider another input:
    user: '; DROP TABLE users; SELECT * FROM users WHERE pass = '
    pass: abc

Resulting commands:

    SELECT * FROM users WHERE user = '';
    DROP TABLE users;
    SELECT * FROM users WHERE pass = '' AND pass = 'abc';

Multi-command injection attacks

Query built from the form fields u and p:

    SELECT * FROM users
    WHERE user = u AND pass = p;

Consider another input:
    user: '; SHUTDOWN WITH NOWAIT; SELECT * FROM users WHERE pass = '
    pass: abc

Resulting commands:

    SELECT * FROM users WHERE user = '';
    SHUTDOWN WITH NOWAIT;
    SELECT * FROM users WHERE pass = '' AND pass = 'abc';

Injection attacks – other inputs

Demo: http://pages.stern.nyu.edu/~mjohnson/dbms/perl/users.cgi

Query built from the form fields u and p:

    DELETE FROM users
    WHERE user = u AND pass = p;

Consider another input:
    user: ' OR 1=1 OR user = '
    pass: ' OR 1=1 OR user = '

Resulting query:

    DELETE FROM users
    WHERE user = ''
    OR 1=1
    OR user = ''
    AND pass = ''
    OR 1=1
    OR user = '';

Delete everyone!

Preventing injection attacks

Source of problem (in SQL case): use of quotes
Soln 1: don’t allow quotes!
    Reject any entered data containing single quotes
    Q: Is this satisfactory?
    Does Amazon need to sell O’Reilly books?
Soln 2: escape any single quotes
    Replace any ' with a '' or \'
    In PHP, turn on magic_quotes_gpc flag in .htaccess
    show both versions

Preventing injection attacks

When to do security checking for quotes, etc.?
Natural choice: in client-side data validation
But not enough!
    As we saw: anyone can still manually submit GET and POST requests
Must do security checking on server

Preventing injection attacks

Soln 3: use prepared/parameterized queries
    Supported in JDBC, Perl DBI, PHP ext/mysqli
    http://pages.stern.nyu.edu/~mjohnson/dbms/perl/loginsafe.cgi
    http://pages.stern.nyu.edu/~mjohnson/dbms/perl/userssafe.cgi
Very dangerous: using tainted data to run commands at the Unix command prompt
    Semi-colons, prime char, etc.
    Safest: define set of legal chars, not illegal ones
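A minimal sketch of Soln 3 and of the "legal chars" idea, using Python's sqlite3 module as a stand-in for the JDBC/DBI/mysqli interfaces named above. The table contents, the safe_login and is_legal helpers, and the particular set of legal characters are all assumptions made for illustration.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (user TEXT, pass TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("your-boss", "secret"))

def safe_login(conn, user, password):
    # The ? placeholders are bound as values by the driver,
    # so quotes in the input are never parsed as SQL syntax.
    return conn.execute(
        "SELECT * FROM users WHERE user = ? AND pass = ?",
        (user, password),
    ).fetchall()

attack = safe_login(conn, "your-boss", "' OR 1=1 OR pass = '")
legit = safe_login(conn, "your-boss", "secret")

# Whitelist check: accept only characters known to be harmless
# (this particular set is an assumed example).
LEGAL = re.compile(r"[A-Za-z0-9_.-]+")

def is_legal(value):
    return LEGAL.fullmatch(value) is not None

print(len(attack), len(legit), is_legal("report.txt"), is_legal("x; rm -rf /"))
```

The same injected password that opened the unparameterized query now simply fails to match any stored password.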

More Info

phpGB MySQL Injection Vulnerability
    http://www.securiteam.com/unixfocus/6X00O1P5PY.html
"How I hacked PacketStorm"
    http://www.wiretrip.net/rfp/txt/rfp2k01.txt

And now for something completely different: XML

XML: eXtensible Mark-up Language
Very popular language for semi-structured data
Mark-up language: consists of elements composed of tags, like HTML
Emerging lingua franca of the Internet, Web Services, inter-vendor comm

Unstructured data

At one end of continuum: unstructured data
    Text files
    Stock market prices
    CIA intelligence intercepts
    Audio recordings
“Just one damn bit after another” (Henry Ford)
No (intentional, formal) patterns to the data
Difficult to manage/make sense of
    Why we need data-mining

Structured data

At the other end: structured data
    Tables in RDBMSs
Data organized into semantic chunks
    entities
Similar/related entities grouped together
    Relationships, classes
Entities in same group have same structure
    Same fields/attributes/properties
Easy to make sense of
    But sometimes too rigid a req.
    Difficult to send: convert to tab-delimited

Semi-structured data

Not too random
    Data organized into entities
    Similar/related grouped to form other entities
Not too structured
    Some attributes may be missing
    Size of attributes may vary
    Support of lists/sets
Juuust Right
    Data is self-describing

Semi-structured data

Predominant examples:
    HTML: HyperText Mark-up Language
    XML: eXtensible Mark-up Language
NB: both are mark-up languages (use tags)
Mark-up lends itself to semi-structured data
    Demarcate boundaries for entities
    But freely allow other entities inside

Data model for semi-structured data

Usually represented as directed graphs
Graph: set of vertices (nodes) and edges
    Dots connected by lines; not nec. a tree!
In model,
    Nodes ~ entities or fields/attributes
    Edges ~ attribute-of/sub-entity-of
Example: publisher publishes >=0 books
    Each book has one title, one year, >=1 authors
Draw publishers graph

XML is a SSD language

Standard published by W3C
    Officially announced/recommended in 1998
XML != HTML
XML != a replacement for HTML
Both are mark-up languages
Big diffs:
    1. XML doesn’t use predefined tags (!)
       But it’s extensible: tags can be added
    2. HTML is about presentation: <I>, <B>, <P>
       XML is about content: <book>, <author>

XML syntax

Like HTML in many respects but more strict
All tags must be closed
    Can’t have: this is a line<br>
    Every start tag has an end tag
    Although <br/> style can replace both
IS case-sensitive
IS space-sensitive
XML doc has a unique root element

XML syntax

Tags must be properly nested
    Not allowed: <b><i>I’m not kidding</b></i>
    Intuition: file folders
Elements may have quoted attributes
    <Myelm myatt="myval">…</Myelm>
Comments same as in HTML:
    <!-- Pay no attention… -->
Draw publishers XML
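The syntax rules above can be checked mechanically. The sketch below uses Python's standard xml.etree.ElementTree parser as a stand-in for any XML parser; is_well_formed is a hypothetical helper name, and the sample strings are adapted from the slides.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as a single well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

nested_ok = is_well_formed("<b><i>I'm not kidding</i></b>")
misnested = is_well_formed("<b><i>I'm not kidding</b></i>")        # bad nesting
unclosed = is_well_formed("<p>this is a line<br></p>")              # <br> never closed
self_closed = is_well_formed("<p>this is a line<br/></p>")          # <br/> is fine
case_mismatch = is_well_formed('<Myelm myatt="myval">x</myelm>')    # case-sensitive
two_roots = is_well_formed("<a/><b/>")                              # no unique root

print(nested_ok, misnested, unclosed, self_closed, case_mismatch, two_roots)
```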

Escape chars in XML

Some chars must be escaped
    Distinguish content from syntax

    <    &lt;
    >    &gt;
    &    &amp;
    "    &quot;
    '    &apos;

    <elm>3 &lt; 5</elm>
    <elm>&quot;Don&apos;t call me &apos;Shirley&apos;!&quot;</elm>

Can also declare value to be pure text:

    <aRealTag> <![CDATA[<notAtag>jsdljsd<neitherAmI<"'><>>]]></aRealTag>
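A short sketch of the escaping rules, using the standard library's xml.sax.saxutils.escape and ElementTree. The sample strings are made up; the dictionary passed to escape adds the two quote entities, which that function does not replace by default.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# escape() handles &, < and > itself; quotes are added via the mapping.
quote_entities = {'"': "&quot;", "'": "&apos;"}
escaped = escape("3 < 5 & \"Don't\"", quote_entities)

# Parsing turns the entities back into the original characters.
parsed_text = ET.fromstring("<elm>3 &lt; 5</elm>").text

# Inside CDATA, markup characters are treated as plain text.
cdata_text = ET.fromstring("<aRealTag><![CDATA[<notAtag>]]></aRealTag>").text

print(escaped)
print(parsed_text)
print(cdata_text)
```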

XML Namespaces

Different schemas/DTDs may overlap
    XHTML and MathML share some tags
Soln: namespaces
    as in Java/C++/C#

    <book xmlns:isbn="www.isbn-org.org/def">
      <title> … </title>
      <number> 15 </number>
      <isbn:number> …. </isbn:number>
    </book>

From Relational Data to XML Data

Table persons:

    name    phone
    John    3634
    Sue     6343
    Dick    6363

XML: persons

    <persons>
      <row> <name>John</name> <phone>3634</phone> </row>
      <row> <name>Sue</name> <phone>6343</phone> </row>
      <row> <name>Dick</name> <phone>6363</phone> </row>
    </persons>

[Figure: tree rooted at persons with three row children; each row has a name leaf (“John”, “Sue”, “Dick”) and a phone leaf (3634, 6343, 6363)]

Semi-structured Data Explained

List-valued attributes
XML is not 1NF!

    <person>
      <name>Mary</name>
      <phone>2345</phone>
      <phone>3456</phone>    (two phones!)
    </person>

Impossible in (single) tables:

    name    phone
    Mary    2345 3456 ???
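To make the 1NF point concrete, here is a small sketch that reads the two-phone person with Python's ElementTree and flattens it into one (name, phone) row per phone, which is one common way to fit a list-valued attribute into a relational table. The flattening choice is an illustration, not something the slide prescribes.

```python
import xml.etree.ElementTree as ET

person = ET.fromstring(
    "<person><name>Mary</name>"
    "<phone>2345</phone><phone>3456</phone></person>"
)

# findall returns every matching child, so the list can have any length.
phones = [p.text for p in person.findall("phone")]

# One row per phone: the name repeats, but each cell holds a single value.
rows = [(person.findtext("name"), phone) for phone in phones]
print(rows)
```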

Object ids and References

SSD graph might not be trees!
But XML docs must be
    Would cause much redundancy
Soln: same concept as pointers in C/C++/J
    Object ids and references
Graph example:
    Movies: Lost in Translation, Hamlet
    Stars: Bill Murray, Scarlett Johansson

    <movieinfo>
      <movie id="o111">
        <title>Lost in Translation</title>
        <year>2003</year>
        <stars idref="o333 o444"/>
      </movie>
      <movie id="o222">
        <title>Hamlet</title>
        <year>1999</year>
        <stars idref="o333"/>
      </movie>
      <person id="o333">
        <name>Bill Murray</name>
        <movies idref="o111 o222"/>
      </person>
    </movieinfo>
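A sketch of how a program might follow id/idref "pointers". It uses a cut-down version of the movie document in which Bill Murray's person element is assumed to carry id o333 (the id both movies refer to), and it omits the second star, whose element the slide does not show. The resolve helper is hypothetical.

```python
import xml.etree.ElementTree as ET

doc = """<movieinfo>
  <movie id="o111"><title>Lost in Translation</title><stars idref="o333"/></movie>
  <movie id="o222"><title>Hamlet</title><stars idref="o333"/></movie>
  <person id="o333"><name>Bill Murray</name><movies idref="o111 o222"/></person>
</movieinfo>"""

root = ET.fromstring(doc)

# Index every element that carries an id, so references can be looked up.
by_id = {el.get("id"): el for el in root.iter() if el.get("id")}

def resolve(element, child_tag):
    """Follow the space-separated idref list on element's child_tag child."""
    refs = element.find(child_tag).get("idref").split()
    return [by_id[ref] for ref in refs]

titles = [m.findtext("title") for m in resolve(by_id["o333"], "movies")]
stars = [p.findtext("name") for p in resolve(by_id["o222"], "stars")]
print(titles, stars)
```

Each movie and person is stored once; the graph edges live in the idref attributes rather than in nested copies.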

What do we do with XML?

Things done with XML:
    Send to partners
    Parse XML received
    Convert to RDBMS rows
    Query for particular data
    Convert to other XML
    Convert to formats other than XML
Lots of tools/standards for these…

DTDs & understanding XML

XML is extensible
Advantage: when creating, we can use any tags we like
Disadv: when reading, they can use any tags they like
Using XML docs a priori is very difficult
Solution: impose some constraints

DTDs

DTD: Document Type Definition
You and partners/vertical industry/academic discipline decide on a DTD/schema for your docs
    Specify which entities you may use/must understand
    Specify legal relationships
DTD specifies the grammar to be used
    DTD = set of rules for creating valid entities
DTD tells your software what to look for in doc

DTD examples

Well-formed XML v. valid XML
Simple example:
    http://pages.stern.nyu.edu/~mjohnson/dbms/xml/note.xml
    http://pages.stern.nyu.edu/~mjohnson/dbms/xml/badnote.xml
    http://pages.stern.nyu.edu/~mjohnson/dbms/xml/badnote2.xml
    Copy from: http://pages.stern.nyu.edu/~mjohnson/dbms/eg/xml.txt
Partial publisher example rules:
    Root: publisher
    Publisher: name, book*, author*
    Book: title, date, author+
    Author: firstname, middlename?, lastname

Partial DTD example (typos!)

    <?xml version="1.0" encoding="UTF-8" ?>
    <!DOCTYPE PUBLISHER [
    <!ELEMENT PUBLISHER (name, book*, author*)>
    <!ELEMENT name (#PCDATA)>
    <!ELEMENT BOOK (title, date, author+)>
    <!ELEMENT AUTHOR (firstname, middlename?, lastname>
    <!ELEMENT firstname (#PCDATA)>
    <!ELEMENT lastname (#PCDATA)>
    <!ELEMENT middlename (#PCDATA)>

DTD is not XML, but can be embedded in or ref.ed from XML
Replacement for DTDs is XML Schemas

XML Applications/dialects

MathML: Mathematical Markup Language
    http://wwwasdoc.web.cern.ch/wwwasdoc/WWW/publications/ictp99/ictp99N8059.html
VoiceXML:
    http://newmedia.purchase.edu/~Jeanine/interfaces/rps.xml
ChemML: Chemical Markup Language
XHTML: HTML retrofitted as an XML application

SQL*Plus settings

    SQL> SET RECSEP OFF
    SQL> COLUMN text FORMAT A60

XML in Oracle - purchase-order example

    <?xml version="1.0"?>
    <purchase_order>
      <customer_name>Alpha Tech</customer_name>
      <po_number>11257</po_number>
      <po_date>2004-01-20</po_date>
      <po_items>
        <item>
          <part_number>AI5-4557</part_number>
          <quantity>20</quantity>
        </item>
        <item>
          <part_number>EI-T5-001</part_number>
          <quantity>12</quantity>
        </item>
      </po_items>
    </purchase_order>

Storing XML data

As of 9i, Oracle has XMLType data type
By default, underlying storage is as CLOB

    CREATE TABLE purchase_order(
      po_id number(5) not null,
      customer_po_nbr varchar(20),
      customer_inception_date date,
      order_nbr number(5),
      purchase_order_doc xmltype,
      constraint purchase_order_pk primary key(po_id));

Loading XML into Oracle

First, log in as sys:

    connect sys/junk as sysdba
    create directory xml_data as '/xml';
    grant read, write on directory xml_data to scott;

Now scott can import:

    connect scott/tiger

    declare
      bf1 bfile;
    begin
      bf1 := bfilename('XML_DATA', 'purch_ord.xml');
      insert into purchase_order(po_id, purchase_order_doc)
      values(1000, xmltype(bf1, nls_charset_id('we8mswin1252')));
    end;
    /

Loading XML into Oracle

Not just loading raw text
XMLType data must be well-formed
    Parsable as XML
Try modifying customer_name open tag

Accessing XML in Oracle

Now can look at raw XML:

    SQL> SELECT purchase_order_doc
         FROM purchase_order;

Can also use XPath to extract particular nodes and values, with extract function:

    SQL> SELECT extract(purchase_order_doc, '/purchase_order/customer_name')
         FROM purchase_order;

XPath in Oracle

Can also extract all nodes of one type, underneath some node, with double-slash //
    All purchase order items:

    SQL> SELECT extract(purchase_order_doc, '/purchase_order//item')
         FROM purchase_order;

NB: the result is not a well-formed XML doc
    No unique root
Can request just one with bracket op
    Numbering starts at 1, not 0

    SQL> SELECT extract(purchase_order_doc, '/purchase_order/po_items/item[1]')
         FROM purchase_order;

Wrong name/number: no error, no results
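The same path ideas can be tried outside Oracle. The sketch below uses the limited XPath subset in Python's ElementTree as an analogy for extract and extractvalue; it is not Oracle's implementation, and because ElementTree paths are relative to the root element, the leading /purchase_order step is dropped. The document is the purchase-order example from the earlier slide.

```python
import xml.etree.ElementTree as ET

PO = """<purchase_order>
  <customer_name>Alpha Tech</customer_name>
  <po_number>11257</po_number>
  <po_date>2004-01-20</po_date>
  <po_items>
    <item><part_number>AI5-4557</part_number><quantity>20</quantity></item>
    <item><part_number>EI-T5-001</part_number><quantity>12</quantity></item>
  </po_items>
</purchase_order>"""

po = ET.fromstring(PO)

customer = po.findtext("customer_name")    # value of one node
all_items = po.findall(".//item")          # every item at any depth, like //item
first = po.find("po_items/item[1]")        # positions start at 1, not 0
missing = po.findall("po_items/item[3]")   # bad position: empty result, no error

print(customer, len(all_items), first.findtext("part_number"), missing)
```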

extract v. extractvalue

extractvalue returns value, not whole node:

    SQL> SELECT extractvalue(purchase_order_doc, '/purchase_order/customer_name')
         FROM purchase_order;

vs.

    SQL> SELECT extract(purchase_order_doc, '/purchase_order/customer_name')
         FROM purchase_order;

extractvalue applies only to unique nodes:

    SQL> SELECT extractvalue(purchase_order_doc, '/purchase_order/po_items')
         FROM purchase_order;

existsnode function

Can check whether node/location exists with existsnode function
    Returns 1 or 0

    SQL> SELECT po_id FROM purchase_order
         WHERE existsnode(purchase_order_doc, '/purchase_order/customer_name') = 1;

Also applies to bracketed paths:

    SQL> SELECT po_id FROM purchase_order
         WHERE existsnode(purchase_order_doc, '/purchase_order/po_items/item[1]') = 1;

Moving data from XML to relations

To move single values from XML to tables, can simply use extractvalue in UPDATE statements:

    SQL> UPDATE purchase_order
         SET order_nbr = 7101,
             customer_po_nbr = extractvalue(purchase_order_doc, '/purchase_order/po_number'),
             customer_inception_date =
               to_date(extractvalue(purchase_order_doc, '/purchase_order/po_date'), 'yyyy-mm-dd');

Moving data from XML to relations

What about moving a set of nodes?
    The two item nodes

    SQL> SELECT extract(purchase_order_doc, '/purchase_order//item')
         FROM purchase_order;

Use xmlsequence to get a varray of items
Use TABLE to convert to a relation

    SQL> SELECT rownum, item.*
         FROM TABLE(
           SELECT xmlsequence(extract(purchase_order_doc, '/purchase_order//item'))
           FROM purchase_order) item;

Moving data from XML to relations

Result is a two-row relation with XMLTypes
Can use extractvalue to extract this data
First, create destination table:

    CREATE TABLE LINE_ITEM(
      ORDER_NBR NUMBER(9) NOT NULL,
      PART_NBR VARCHAR2(20) NOT NULL,
      QTY NUMBER(5) NOT NULL,
      FILLED_QTY NUMBER(5),
      CONSTRAINT line_item_pk PRIMARY KEY (ORDER_NBR, PART_NBR));

Moving data from XML to relations

Then insert results:

    SQL> INSERT INTO line_item(order_nbr, part_nbr, qty)
         SELECT 7109,
                extractvalue(column_value, '/item/part_number'),
                extractvalue(column_value, '/item/quantity')
         FROM TABLE(
           SELECT xmlsequence(extract(purchase_order_doc, '/purchase_order//item'))
           FROM purchase_order);
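The shredding step can be sketched without Oracle. Below, ElementTree plays the role of extract/xmlsequence and an in-memory SQLite table stands in for LINE_ITEM; the SQLite column types are approximations of the Oracle ones, and the order number 7109 is copied from the INSERT above.

```python
import sqlite3
import xml.etree.ElementTree as ET

PO = """<purchase_order>
  <po_items>
    <item><part_number>AI5-4557</part_number><quantity>20</quantity></item>
    <item><part_number>EI-T5-001</part_number><quantity>12</quantity></item>
  </po_items>
</purchase_order>"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE line_item ("
    " order_nbr INTEGER NOT NULL, part_nbr TEXT NOT NULL,"
    " qty INTEGER NOT NULL, filled_qty INTEGER,"
    " PRIMARY KEY (order_nbr, part_nbr))"
)

# One tuple per <item> node: the analogue of the two-row relation.
items = ET.fromstring(PO).findall(".//item")
rows = [
    (7109, item.findtext("part_number"), int(item.findtext("quantity")))
    for item in items
]
conn.executemany(
    "INSERT INTO line_item (order_nbr, part_nbr, qty) VALUES (?, ?, ?)", rows
)

stored = conn.execute(
    "SELECT part_nbr, qty FROM line_item ORDER BY part_nbr"
).fetchall()
print(stored)
```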

XML Schemas and Oracle

By default, XML must be well-formed to be read into the XMLType field
XML is valid if it conforms to a schema
To use a schema with Oracle, must first register it:

    declare
      bf1 bfile;
    begin
      bf1 := bfilename('XML_DATA', 'purch_ord.xsd');
      dbms_xmlschema.registerschema(
        'http://localhost:8080/home/xml/schemas/purch_ord.xsd', bf1);
    end;

XML Schemas and Oracle

With schema registered, can apply it to an XMLType field

    CREATE TABLE purchase_order2 (
      po_id NUMBER(5) NOT NULL,
      customer_po_nbr VARCHAR2(20),
      customer_inception_date DATE,
      order_nbr NUMBER(5),
      purchase_order_doc XMLTYPE,
      CONSTRAINT purchase_order2_pk PRIMARY KEY (po_id))
    XMLTYPE COLUMN purchase_order_doc
      XMLSCHEMA "http://localhost:8080/home/xml/schemas/purch_ord.xsd"
      ELEMENT "purchase_order";

Importing to schema field

Try to import xml file, get error:

    declare
      bf1 bfile;
    begin
      bf1 := bfilename('XML_DATA', 'purch_ord.xml');
      insert into purchase_order2(po_id, purchase_order_doc)
      values (2000, XMLTYPE(bf1, nls_charset_id('WE8MSWIN1252')));
    end;

Importing to schema field

Root node of XML must specify the schema
Change root to the following:

    <purchase_order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="http://localhost:8080/home/xml/schemas/purch_ord.xsd">

Now can import
Also fails if extra or missing nodes
    Modify customer_name node
    Add new comments node

Can check to see whether schema is used

Can call isSchemaBased(), getSchemaURL() and isSchemaValid() on XMLType fields:

    SQL> select po.purchase_order_doc.isSchemaBased(),
                po.purchase_order_doc.getSchemaURL(),
                po.purchase_order_doc.isSchemaValid()
         from purchase_order2 po;

Updating XMLType data

Can update XMLType data with ordinary UPDATE statements:

    SQL> UPDATE purchase_order po
         SET po.purchase_order_doc =
           XMLTYPE(BFILENAME('XML_DATA', 'purch_ord_alt.xml'),
                   nls_charset_id('WE8MSWIN1252'))
         WHERE po.po_id = 2000;

Replaces whole XMLType object with new one

Updating XMLType data

Can also modify the existing XMLType object
    By writing node values
updateXML() function does search/replace
    But searches for node, not value

    SQL> SELECT extract(po.purchase_order_doc, '/purchase_order/customer_name')
         FROM purchase_order po
         WHERE po_id = 1000;

    SQL> UPDATE purchase_order po
         SET po.purchase_order_doc =
           updateXML(po.purchase_order_doc,
                     '/purchase_order/customer_name/text()', 'some other company')
         WHERE po.po_id = 1000;

Updating XMLType data

Can also write whole node, using XMLType:

    SQL> UPDATE purchase_order po
         SET po.purchase_order_doc =
           updateXML(po.purchase_order_doc,
                     '/purchase_order/customer_name',
                     XMLTYPE('<customer_name>some third company</customer_name>'))
         WHERE po.po_id = 1000;

    SQL> SELECT extract(po.purchase_order_doc, '/purchase_order/customer_name')
         FROM purchase_order po
         WHERE po_id = 1000;

Validation/well-formedness is still checked

Updating XMLType data

And can update items in a collection:

    SQL> SELECT extract(po.purchase_order_doc, '/purchase_order//item')
         FROM purchase_order po
         WHERE po.po_id = 1000;

    SQL> UPDATE purchase_order po
         SET po.purchase_order_doc =
           updateXML(po.purchase_order_doc,
                     '/purchase_order/po_items/item[1]',
                     XMLTYPE('<item><part_number>T-1000</part_number><quantity>33</quantity></item>'))
         WHERE po.po_id = 1000;

Converting relational data to XML

Saw how to put XML in a table
Conversely, can convert ordinary relational data to XML
XMLElement() generates an XML node
First, create supplier table:

    CREATE TABLE SUPPLIER(
      SUPPLIER_ID NUMBER(5) NOT NULL,
      NAME VARCHAR2(30) NOT NULL,
      PRIMARY KEY (SUPPLIER_ID));
    insert into supplier values(1, 'Acme');
    insert into supplier values(2, 'Tilton');
    insert into supplier values(3, 'Eastern');

Converting relational data to XML

Now can call XMLElement function to wrap values in tags:

    SELECT XMLElement("supplier_id", s.supplier_id) ||
           XMLElement("name", s.name) xml_fragment
    FROM supplier s;

Don’t concatenate!
    Turns to strings, escapes < >
    Error in book
And can build it up:

    SELECT XMLElement("supplier",
             XMLElement("supplier_id", s.supplier_id),
             XMLElement("name", s.name))
    FROM supplier s;

XMLForest()

More simply, can use XMLForest() function:

    SELECT XMLElement("supplier", XMLForest(s.supplier_id, s.name))
    FROM supplier s;

XMLAgg()

Can use XMLAgg() to put nodes together inside another node:

    SELECT XMLElement("supplier_list",
             XMLAgg(XMLElement("supplier",
               XMLElement("supplier_id", s.supplier_id),
               XMLElement("name", s.name)))) xml_document
    FROM supplier s;

|| typo again…
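As a rough analogy for XMLElement and XMLAgg, the sketch below builds the same supplier_list document from the three supplier rows using Python's ElementTree. It also shows the "don't concatenate" pitfall in that setting: markup assigned as a plain string is escaped rather than treated as child nodes. None of this is Oracle code, and the mapping between the functions is only approximate.

```python
import xml.etree.ElementTree as ET

suppliers = [(1, "Acme"), (2, "Tilton"), (3, "Eastern")]

# Outer element plays the role of XMLElement("supplier_list", XMLAgg(...)).
supplier_list = ET.Element("supplier_list")
for supplier_id, name in suppliers:
    supplier = ET.SubElement(supplier_list, "supplier")
    ET.SubElement(supplier, "supplier_id").text = str(supplier_id)
    ET.SubElement(supplier, "name").text = name

xml_document = ET.tostring(supplier_list, encoding="unicode")

# Pitfall: a string that merely looks like markup is stored as text,
# so its < and > come out escaped.
wrong = ET.Element("supplier")
wrong.text = "<name>Acme</name>"
wrong_document = ET.tostring(wrong, encoding="unicode")

print(xml_document)
print(wrong_document)
```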

New topic: Regular Expressions

In automata theory, Finite Automata are the simplest, weakest kind of computer, Turing Machines the strongest
    Chomsky’s Hierarchy
FAs are equivalent to regular expressions
    Expressions that specify a pattern
    Can check whether a string matches the pattern

RegEx matching

Use REGEXP_LIKE
Metachar for any char is .
First, get employee_comment table:
    http://pages.stern.nyu.edu/~mjohnson/oracle/empcomm.sql
Now do search:

    SELECT emp_id, text
    FROM employee_comment
    WHERE REGEXP_LIKE(text, '...-....');

So far, like LIKE

RegEx matching

Can also pull out the matching text with REGEXP_SUBSTR:

    SELECT emp_id, REGEXP_SUBSTR(text, '...-....') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text, '...-....');

If want only numbers, can specify a set of chars rather than a dot:

    SELECT emp_id, REGEXP_SUBSTR(text, '[0123456789]..-....') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text, '...-....');

RegEx matching

Or can specify a range of chars:

    SELECT emp_id, REGEXP_SUBSTR(text, '[0-9]..-....') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text, '...-....');

Or, finally, can state how many copies to match:

    SELECT emp_id, REGEXP_SUBSTR(text, '[0-9]{3}-[0-9]{4}') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text, '[0-9]{3}-[0-9]{4}');
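The progression from dots to digit ranges can be tried with Python's re module, used here as a stand-in for REGEXP_LIKE and REGEXP_SUBSTR (the two engines agree on these simple patterns, though they are not identical in general). The comment rows are invented sample data, not the contents of empcomm.sql, and extract_all is a hypothetical helper.

```python
import re

comments = [
    (1, "Call 555-1234 after lunch"),
    (2, "Part abc-defg is back-ordered"),
    (3, "No phone on file"),
]

any_chars = re.compile(r"...-....")          # dot matches any character
digits = re.compile(r"[0-9]{3}-[0-9]{4}")    # exactly 3 digits, dash, 4 digits

def extract_all(pattern, rows):
    """Keep rows that match (like REGEXP_LIKE) and return the matched text."""
    found = []
    for emp_id, text in rows:
        match = pattern.search(text)
        if match:
            found.append((emp_id, match.group()))
    return found

loose = extract_all(any_chars, comments)
strict = extract_all(digits, comments)
print(loose)
print(strict)
```

The dot pattern also picks up the part number in row 2; the digit pattern keeps only the phone number.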

RegExp matching

Other operators:
    * - 0 or more matches
    + - 1 or more matches
    ? - 0 or 1 match
Also, can OR options together with | op
Here: some phone nums have area codes, some not, so want to match both:

    SELECT emp_id,
           REGEXP_SUBSTR(text, '[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{4}') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text, '[0-9]{3}-[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{4}');

RegExp matching

Order of ORed together patterns matters:
    First matching pattern wins

    SELECT emp_id,
           REGEXP_SUBSTR(text, '[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{3}-[0-9]{4}') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text, '[0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{3}-[0-9]{4}');
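The "first matching pattern wins" rule is easiest to see when one alternative is a prefix of the other. The sketch below uses two deliberately simplified patterns (not the ones on the slide) and Python's re module, whose alternation is also tried left to right; Oracle's engine may differ in other respects, so treat this only as an illustration of the ordering effect.

```python
import re

text = "555-1234"

# Short alternative listed first: it matches at position 0 and wins,
# so the longer alternative is never tried there.
short_first = re.search(r"[0-9]{3}|[0-9]{3}-[0-9]{4}", text).group()

# Longer alternative listed first: it gets the first chance and matches.
long_first = re.search(r"[0-9]{3}-[0-9]{4}|[0-9]{3}", text).group()

print(short_first, long_first)
```

Putting the more specific alternative first is the usual way to avoid the truncated match.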

RegExp matching

There’s a shared structure between the two, tho
Area code is just optional
Can use ? op

    SELECT emp_id, REGEXP_SUBSTR(text, '([0-9]{3}-)?[0-9]{3}-[0-9]{4}') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text, '([0-9]{3}-)?[0-9]{3}-[0-9]{4}');

RegExp matching

Also, different kinds of separators:
    dash, dot, just blank
Can OR together whole number patterns
Better: Just use set of choices of each sep.

    SELECT emp_id,
           REGEXP_SUBSTR(text, '([0-9]{3}[-. ])?[0-9]{3}[-. ][0-9]{4}') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text, '([0-9]{3}[-. ])?[0-9]{3}[-. ][0-9]{4}');

RegExp matching

One other thing: area codes in parentheses
Of course, area codes are still optional
Parentheses must be escaped - \( \)

    SELECT emp_id,
           REGEXP_SUBSTR(text, '([0-9]{3}[-. ]|\([0-9]{3}\) )?[0-9]{3}[-. ][0-9]{4}') text
    FROM employee_comment
    WHERE REGEXP_LIKE(text, '([0-9]{3}[-. ]|\([0-9]{3}\) )?[0-9]{3}[-. ][0-9]{4}');
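A quick way to sanity-check the final pattern is to run it over a few sample strings. The sketch below copies the pattern into Python's re module, which accepts this syntax unchanged; the sample comments are invented, and Oracle's results on real data could differ in edge cases.

```python
import re

# Same pattern text as in the query above.
PHONE = re.compile(r"([0-9]{3}[-. ]|\([0-9]{3}\) )?[0-9]{3}[-. ][0-9]{4}")

samples = [
    "home: 555-1234",
    "work: 212-555-1234",
    "fax: 212.555.1234",
    "cell: 212 555 1234",
    "desk: (212) 555-1234",
    "ext. 12 only",
]

def first_phone(text):
    """Return the first phone-like substring, or None if there is none."""
    match = PHONE.search(text)
    return match.group() if match else None

extracted = [first_phone(s) for s in samples]
print(extracted)
```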