The Open Access Publisher
Agenda
• Oracle interMedia Overview
• Open Access for the Life Science Community
• BioMed Central Business Model
• Oracle Technologies used by BioMed Central
Oracle interMedia
Multimedia Databases
Multi-Terabyte Performance
Agenda
• The Media-enabled Oracle Platform
• Benefits
• Customer Experience
• Oracle Database 10g New Features
• Proposed Enhancements
The Media-enabled Oracle Platform
• Oracle Database 10g
– Storage, management, and retrieval of image, audio, and video data
– Native format understanding, metadata extraction, and methods for image processing
– Support for leading streaming media servers
• Oracle Application Server 10g
– JSP, Servlet, and PL/SQL application development support
– Media Adaptation Services for Wireless
– JDeveloper (BC4J) and Portal integration
• Oracle Collaboration Suite
– Metadata extraction for OCS Files
Benefits: Save labor, time and money
• BioMed Central: automated media processing, serving, and integration
• New Mexico Department of Transportation: a single DBA designed, created, deployed, and maintains a 5 TB image management system
• Palazzo Braschi Museum, Rome: bulk loading and processing of images takes 90% less time than with client-side tools
• A US central bank: online processing and rapid resolution of 26,000 bad checks per day reduces handling and float costs
Fast & Scalable
• 1 TB image repository renders images in a Web browser in less than 0.4 seconds
• Loads at device speeds
• Multi-terabyte multimedia databases
– 5 TB database
– 140 million images
• Scalable bulk load and process
– Parallel processes load 300,000 images/hour
– Bulk processing: TIFF to GIF conversion, scaling to thumbnail
* UBS Paine Webber, Caixa Economica Federal, NM DOT
Secure and Manageable
• Use all Oracle Database security features
– Authentication, auditing, encryption, access control, etc.
• Banks and commercial Web sites use it
• One management environment for all data
– Single DBA for 5 TB database
– 3 TB financial database
* A US Central Bank, BioMed Central, Cre8tiv (UK), Spa Microsystems (UK), NM DOT, Caixa Economica Federal
Oracle Simplifies Code
Image Insert using Multimedia JSP Tag Library: An Example
With JSP Tag Library:
<%
  java.util.Vector otherValuesVector = new java.util.Vector();
  otherValuesVector.add(fd.getParameter("desc"));
  otherValuesVector.add(fd.getParameter("loc"));
%>
<ord:embedImage connCache = "…"
    mediaParameters = "photo"
    otherColumns = "description, location"
    otherValues = "<%=otherValuesVector%>" />
Without:
<FORM ACTION="PhotoAlbumInsert.jsp" METHOD="POST" ENCTYPE="MULTIPART/FORM-DATA">
  Description: <INPUT TYPE="text" NAME="desc"><BR>
  Location: <INPUT TYPE="text" NAME="loc"><BR>
  Photo: <INPUT TYPE="file" NAME="photo"><BR>
  <INPUT TYPE="submit" VALUE="submit">
</FORM>
<%
try {
  // Parse multipart/form-data
  formData.setServletRequest( request );
  formData.parseFormData();

  // Insert new row into database ("desc" and "loc" are the form field names above)
  stmt = (OraclePreparedStatement)conn.prepareStatement(
      "insert into spec_photos ( description, location, photo ) " +
      " values ( ?, ?, ORDSYS.ORDImage.init() )" );
  stmt.setString( 1, formData.getParameter( "desc" ) );
  stmt.setString( 2, formData.getParameter( "loc" ) );
  stmt.executeUpdate();
  stmt.close();

  // Fetch OrdImage object from database, locking the row for update
  stmt = (OraclePreparedStatement)conn.prepareStatement(
      "select photo from spec_photos where description = ? for update" );
  stmt.setString( 1, formData.getParameter( "desc" ) );
  rset = (OracleResultSet)stmt.executeQuery();
  rset.next();
  OrdImage photo = (OrdImage)rset.getCustomDatum( 1, OrdImage.getFactory() );
  rset.close();
  stmt.close();

  // Load the photo into the database and set the properties
  formData.getFileParameter( "photo" ).loadImage( photo );

  // Update object in database
  stmt = (OraclePreparedStatement)conn.prepareStatement(
      "update spec_photos set photo = ? where description = ?" );
  stmt.setCustomDatum( 1, photo );
  stmt.setString( 2, formData.getParameter( "desc" ) );
  stmt.execute();
  stmt.close();

  // Commit changes
  conn.commit();
} finally {
  // Ensure JDBC connection is released and any temp files are deleted
  album.release();
  formData.release();
}
%>
New Oracle10g Multimedia Features
• Standards support: SQL/MM Still Image
• New version of Java Advanced Imaging and additional image processing operators
• Support for additional media formats
– Microsoft ASF, MPEG2 & MPEG4
• Microsoft Windows Media Server Plugin
• Real Server Plugin for Helix Server
• XML DB integration
Proposed Enhancements
• Parse TIFF headers for user-specified attributes
• Metadata management, e.g. microarrays, gels, mass spec.
• Characterize a region of interest for an image
• Plug-in 3rd party algorithms & utilities
• Manage media metadata in XML DB
• Describe user-defined file formats
• Keep a history of changes to images
• Handle 3-D images (time/volume)
• DICOM support
Multimedia Database Improves the Bottom Line
Matthew Cockerill
Technical Director
BioMed Central
Session id: 40363
BioMed Central and Oracle
• BioMed Central is an Open Access publisher of biomedical research
• Oracle database technology is used to deliver a cost-effective online publishing solution
• Goals
– Make the publishing process more efficient through online tools and automation
– Increase accessibility of research by removing subscription barriers
Key technologies used
• Oracle technology used by BioMed Central
– XML DB
– Oracle interMedia
– Real Application Clusters
– Data Guard
– Oracle Text
• BioMed Central’s database
– 70 gigabytes of data (and growing rapidly)
– Lots of traditional relational data (e.g. 250,000 registered users)
– Also serves as a repository for images, movies, PDFs and other rich media
What is wrong with traditional science publishing?
• Subscription-only access to scientific research is a legacy of the economics of print
• Scientists do all the hard work
– performing the research
– writing up the article
– acting as peer reviewers
– acting as journal editors
• Traditional publishers take ownership of the copyright and sell limited access back to the scientific community
• In the age of the web that makes no sense for science
• Open Access publishers make research freely accessible and redistributable by scientists
Benefits of Open Access
• Research instantly accessible to the entire scientific community
• Digital permanence (many copies)
• A route off the subscriptions treadmill
– Subscriptions to traditional journals have increased at 10-15% per annum
• Data mining
• Grid computing
“[The] national e-science grid … intends to make access to computing power, scientific data repositories and experimental facilities as easy as the web makes access to information.”
- Tony Blair, May 2002
The Open Access movement
• Public Library of Science
– New not-for-profit publisher formed by a group of scientists
– Has received $9m from the Gordon and Betty Moore Foundation to start new Open Access journals
• Soros Foundation
– Has provided $3m to support Open Access publishing in developing and transitional countries
• Sabo bill
– Congressman Martin Sabo recently introduced the Public Access to Science Act in Congress
– If passed, it would ensure that all US federally funded research would be published with Open Access
BioMed Central architecture
• Oracle9i Database
– Stores relational data (e.g. user registration info)
– Also acts as repository for files associated with submitted manuscripts and published articles
• Web server farm
– Runs many different journal websites, all driven by the same Oracle database
– Extensive use of Java and XSLT
– Media content streamed from the database using servlets
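The servlet-based streaming mentioned above comes down to copying bytes from a database stream to the HTTP response in chunks. The following is a minimal sketch of that copy loop, not BioMed Central's actual code: the class and method names are hypothetical, and in a real servlet the input would typically come from java.sql.Blob.getBinaryStream() and the output from HttpServletResponse.getOutputStream(). Here an in-memory byte array stands in for the BLOB so the example runs without a database.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper: copies media bytes from a database stream to a response stream. */
class MediaStreamSketch {

    /**
     * Copies everything from in to out in fixed-size chunks, so the whole
     * file is never held in memory. Returns the number of bytes copied.
     */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a byte array stands in for blob.getBinaryStream().
        byte[] fakeImage = new byte[20000];
        ByteArrayOutputStream response = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(fakeImage), response);
        System.out.println(copied + " bytes streamed");
    }
}
```

Chunked copying keeps memory use flat regardless of file size, which matters when the same servlet serves both small figures and large movies.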
Key Oracle Technologies used by BioMed Central
• Real Application Clusters
• Data Guard
• Oracle Text
• XML DB
• Oracle interMedia
Importance of high availability
• Science is a global enterprise, so BioMed Central’s websites are busy 24 hours a day
• Scientists entrust their research and reputation to us; they must have confidence that their research will be available
• Major institutional customers demand high reliability
• BioMed Central delivers high availability using a combination of RAC and Data Guard
Real Application Clusters
• BioMed Central was one of the first organizations in the UK to deploy 9i RAC
• Main database runs on a pair of dual-CPU Sun Fire V480 servers
• Delivers high availability in the event of single node failure
• However, Oracle upgrades and patches currently still require downtime (for now!)
Data Guard
• BioMed Central uses Data Guard to maintain a standby database
• Standby database is kept up to date by automated application of log files
• Standby database can be used for reporting (in read-only mode)
• If a prolonged outage of the live database occurs (planned or unplanned), the standby database can be activated
• Data Guard makes it easy to roll back to the live configuration after planned outages
RAC/Data Guard configuration
[Diagram. Labels: Main hosting location, Web server farm, RAC Cluster, logfiles, Standby location, Standby DB (Data Guard), Reporting]
Use of Oracle Text
• High performance full text article search
• Key benefits
– Ease of maintenance (incremental online indexing)
– Structured searching of XML
– XPath support
– Unicode aware (smart base-character indexing)
– Filter procedures can be used to transform XML to be indexed
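To make "base-character indexing" concrete: the idea is that accented letters are folded to their base letters before indexing, so a search for "Sjogren" also finds "Sjögren". The sketch below illustrates that folding in plain Java with java.text.Normalizer. It is an illustration of the concept only, not how Oracle Text implements it, and the class and method names are hypothetical.

```java
import java.text.Normalizer;

/** Hypothetical illustration of base-character folding for search indexing. */
class BaseCharSketch {

    /**
     * Decomposes each character into base letter plus combining marks (NFD),
     * then removes the combining marks, leaving only the base letters.
     */
    static String toBaseCharacters(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // "Sj\u00f6gren" is "Sjögren" written with a Unicode escape.
        System.out.println(toBaseCharacters("Sj\u00f6gren")); // prints "Sjogren"
    }
}
```

Applying the same folding to both the indexed text and the query terms is what makes the match insensitive to diacritics.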
Structured search
XPath search
• Prior to Oracle9i Database Release 2, only relatively basic field restrictions based on XML tags were possible
• Complex nesting of tags or specific attribute values were difficult or impossible to search for
• Oracle9i Database Release 2 support for XPath field restrictions takes XML searching to another level
• It is now possible to search for all XML articles that contain a certain path (HASPATH), or that match a certain text expression at that path (INPATH)
XPath example
Article metadata identifying a series of related articles
<meta>
  <classifications>
    <classification type="BMC" subtype="review_series_title" id="ar-cell-cell">Cell-cell interactions in synovitis</classification>
  </classifications>
</meta>
SQL syntax to retrieve all articles in that review series
SELECT ARX_ID
FROM ARX
WHERE CONTAINS (ARX_FULL, 'HASPATH (//classification[@type="BMC" AND @subtype="review_series_title" AND @id="ar-cell-cell"])') > 0;
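An application would normally assemble the HASPATH or INPATH expression from known attribute values rather than hard-coding it. The following sketch shows one way to build that query text in Java, under stated assumptions: the class and method names are hypothetical, no escaping of quotes in attribute values is attempted, and the resulting string is meant to be bound as the second argument of CONTAINS (for example via a JDBC PreparedStatement parameter) rather than concatenated into SQL.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical builder for Oracle Text path expressions used with CONTAINS. */
class PathQuerySketch {

    /** Builds //element[@a="x" AND @b="y"] from an element name and attribute tests. */
    static String pathWithAttributes(String element, Map<String, String> attributes) {
        StringBuilder path = new StringBuilder("//").append(element);
        if (!attributes.isEmpty()) {
            path.append('[');
            boolean first = true;
            for (Map.Entry<String, String> e : attributes.entrySet()) {
                if (!first) {
                    path.append(" AND ");
                }
                // Assumption: values contain no double quotes; no escaping is done here.
                path.append('@').append(e.getKey())
                    .append("=\"").append(e.getValue()).append('"');
                first = false;
            }
            path.append(']');
        }
        return path.toString();
    }

    /** Matches documents that contain the given path at all. */
    static String hasPath(String path) {
        return "HASPATH (" + path + ")";
    }

    /** Matches documents where the text expression occurs at the given path. */
    static String inPath(String text, String path) {
        return text + " INPATH (" + path + ")";
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("type", "BMC");
        attrs.put("subtype", "review_series_title");
        attrs.put("id", "ar-cell-cell");
        String path = pathWithAttributes("classification", attrs);
        System.out.println(hasPath(path));
        System.out.println(inPath("synovitis", "//classification"));
    }
}
```

The first printed line reproduces the HASPATH expression from the slide; the second shows the INPATH form, which restricts a text search to one element.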
Smart handling of Unicode
XML DB
• Oracle support for XML standards in the database allows BioMed Central to manage article XML data within the database
• Examples of use
– Re-validate article XML against DTD after any update
– Application of XSLT transformations within the database (e.g. as a pre-indexing filter)
Article XML (pre-transform)
<bibl>
  <title>Genetic variability in MCF-7 sublines</title>
  <aug>
    <au id="A1">
      <snm>Nugoli</snm>
      <fnm>Melanie</fnm>
      <mi>JK</mi>
      <email>[email protected]</email>
    </au>
    <au id="A2">
      <snm>Chuchana</snm>
      <fnm>Paul</fnm>
      <email>[email protected]</email>
    </au>
  </aug>
  <source>BMC Medical Research Methodology</source>
  …
</bibl>
Article XML (post-transform)
<bibl>
  <title>Genetic variability in MCF-7 sublines</title>
  <aug>
    <au id="A1">
      <snm>Nugoli</snm>
      <fnm>Melanie</fnm>
      <mi>JK</mi>
      <bnm>Nugoli_MJK</bnm>
      <email>[email protected]</email>
    </au>
    <au id="A2">
      <snm>Chuchana</snm>
      <fnm>Paul</fnm>
      <bnm>Chuchana_P</bnm>
      <email>[email protected]</email>
    </au>
  </aug>
  <source>
    <sourcefull>BMC Medical Research Methodology</sourcefull>
    <sourceabbr>BMC Med Res Methodol</sourceabbr>
  </source>
  …
</bibl>
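The author part of the pre/post-transform difference above (a <bnm> element built from surname, first initial, and middle initials) can be expressed as a short XSLT stylesheet. The sketch below is an illustration only: the stylesheet is a guess at the rule implied by the two samples, not BioMed Central's actual filter, and it runs in a plain JVM through javax.xml.transform rather than inside the database as the slides describe. The journal abbreviation step is left out because it would need a lookup table that the slides do not show.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical pre-indexing filter that adds a <bnm> element to each author. */
class BibNameTransformSketch {

    // Assumed rule: bnm = surname + "_" + first letter of first name + middle initials.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      // Identity template: copy everything that has no more specific rule.
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      // Author template: copy the name parts, insert <bnm>, then copy the rest.
      + "<xsl:template match='au'>"
      + "<xsl:copy>"
      + "<xsl:apply-templates select='@*|snm|fnm|mi'/>"
      + "<bnm><xsl:value-of select=\"concat(snm, '_', substring(fnm, 1, 1), mi)\"/></bnm>"
      + "<xsl:apply-templates select='node()[not(self::snm or self::fnm or self::mi)]'/>"
      + "</xsl:copy>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to an XML string and returns the result as a string. */
    static String transform(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String input =
            "<aug>"
          + "<au id=\"A1\"><snm>Nugoli</snm><fnm>Melanie</fnm><mi>JK</mi></au>"
          + "<au id=\"A2\"><snm>Chuchana</snm><fnm>Paul</fnm></au>"
          + "</aug>";
        System.out.println(transform(input));
    }
}
```

An author without a middle initial simply contributes an empty string to the concat call, which is how the second author ends up as Chuchana_P.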
interMedia: Oracle as a media repository
• Manuscript submission and workflow involves a complex interplay of files and metadata
• Storing files directly in the database as BLOBs makes their management and manipulation much simpler
• interMedia provides a powerful set of tools to work with images in the database
– Extracting image metadata
– Scaling/cropping/format conversion
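As an illustration of the scaling operation listed above, the sketch below shrinks an image to fit inside a bounding box while preserving its aspect ratio, which is the usual way thumbnails are produced. interMedia performs this kind of operation inside the database; the sketch deliberately uses plain java.awt classes instead, so it shows the technique rather than the interMedia API. The class and method names are hypothetical.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Hypothetical thumbnail helper: scale to fit a box, keeping the aspect ratio. */
class ThumbnailSketch {

    /** Returns a copy of src scaled down to fit within maxW x maxH; never enlarges. */
    static BufferedImage scaleToFit(BufferedImage src, int maxW, int maxH) {
        // Use the tighter of the two ratios so both dimensions fit the box.
        double ratio = Math.min(1.0, Math.min(
            (double) maxW / src.getWidth(),
            (double) maxH / src.getHeight()));
        int w = Math.max(1, (int) Math.round(src.getWidth() * ratio));
        int h = Math.max(1, (int) Math.round(src.getHeight() * ratio));

        BufferedImage dst = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = dst.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                           RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(src, 0, 0, w, h, null);
        g.dispose();
        return dst;
    }

    public static void main(String[] args) {
        // Assumption: a blank 400x200 image stands in for a submitted figure.
        BufferedImage figure = new BufferedImage(400, 200, BufferedImage.TYPE_INT_RGB);
        BufferedImage thumb = scaleToFit(figure, 100, 100);
        System.out.println(thumb.getWidth() + "x" + thumb.getHeight()); // prints "100x50"
    }
}
```

Doing the same work in the database avoids moving the full-size image to a client just to produce a thumbnail.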
Full text article
Figure streamed from db
PDF streamed from database
Processing submitted files
Using interMedia to manipulate images
Q&A