Assembling, Repurposing And Manipulating Document Content Using The New Office File Format
Brian Jones, Program Manager, Microsoft Corporation
OFF 304

Agenda
Overview of the new formats
Role of XML in documents
Evolution of MS Office file formats
Microsoft Office Open XML format architecture
Components of the new formats
Reference schemas and custom defined schemas
Developing against the formats
Visual Studio Tools for Office (VSTO) support
Sample solution scenarios
Demos throughout

Microsoft Office Open XML Formats
New Default Formats: New XML file formats for Word, Excel and PowerPoint
  New file type extensions
Interoperable: Open, transparent format improves interoperability
  Published file format specification with royalty-free license
  Transparent, XML format enables new integration scenarios for documents and LOB systems
Added Benefits: compact and robust
  ZIP container allows for standard compression on all files without user effort (dramatic file size improvements)
  Significantly more robust files to help minimize data loss
Backward Compatible: Office 2000, Office XP, Office 2003 will all support the new formats
  Patches for compatibility available by launch
  Open, edit and save new formats
Legacy support: Current Office 97-2003 binary file formats supported
  Support for XML formats from Office 2003, Office XP continued
Developers: Endless potential for developers
  Build solutions to read, write, and modify Office files (without the need to run Office APIs)

Demo 1 & 2: Office Open XML Formats
Brian Jones, Program Manager, Microsoft Word

The Role Of XML With Documents
Scenario: Document Assembly
  Server-based or user-assisted construction of documents from archived content or database content
  Example: Create sales reports from financial and forecast data stored in a CRM system
Scenario: Content Reuse
  Much easier to move content between documents, including different document types
  Example: Apply content stored in Word documents to Web pages quickly and efficiently
Scenario: Content Tagging
  Add domain-specific metadata to document content to enable custom solutions
  Example: Tag presentations using a specific taxonomy to improve knowledge management efficiency
Scenario: Document Interrogation
  Query document repositories based on custom data, content types or document metadata
  Example: Search for all documents containing a specific company name or sales contact
Scenario: Document Sanitization
  Remove unwanted content like comments or embedded code from your document when appropriate
  Example: Remove all tracked changes and comments from a Word document before it is published
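
The sanitization example above (removing tracked changes and comments before publishing) can be sketched in a few lines once the document body is available as XML. This is only an illustration under stated assumptions: the namespace URI and the element names (w:ins, w:del, w:commentRangeStart, w:commentRangeEnd, w:commentReference) are those of the later standardized WordprocessingML, not something this talk specifies, and the function name sanitize is made up for the sketch. It treats "remove tracked changes" as accepting them: inserted runs are kept, deleted runs are dropped.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace of the later standardized WordprocessingML.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def q(tag: str) -> str:
    """Qualify a local element name with the w: namespace."""
    return f"{{{W}}}{tag}"


# Elements removed outright: deleted text and comment markers.
DROP = {q("del"), q("commentRangeStart"), q("commentRangeEnd"), q("commentReference")}


def sanitize(elem: ET.Element) -> None:
    """Accept insertions, drop deletions and comment markers, in place."""
    kept = []
    for child in list(elem):
        if child.tag in DROP:
            continue
        sanitize(child)
        if child.tag == q("ins"):
            kept.extend(list(child))  # keep the inserted runs, drop the wrapper
        else:
            kept.append(child)
    elem[:] = kept


SAMPLE = (
    f'<w:p xmlns:w="{W}">'
    '<w:commentRangeStart w:id="0"/>'
    "<w:r><w:t>Keep</w:t></w:r>"
    '<w:ins w:id="1"><w:r><w:t> added</w:t></w:r></w:ins>'
    '<w:del w:id="2"><w:r><w:delText> gone</w:delText></w:r></w:del>'
    '<w:commentRangeEnd w:id="0"/>'
    "</w:p>"
)

paragraph = ET.fromstring(SAMPLE)
sanitize(paragraph)
clean_text = "".join(paragraph.itertext())
```

A complete solution would also have to remove the part that stores the comment text and the relationship pointing to it; the sketch only cleans the markup inside the document body.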

Open XML Formats Architecture
User view: single Office “file”
  Questionnaire.docx
Developer view: modular file
  File Container
    Document Properties
    Comments
    Charts
    Embedded code / macros
    Images, video, sound
    Custom-defined XML
    WordML / SpreadsheetML, etc.
Document Parts
  Most parts are XML
  Each XML part is a discrete, compressed component
  Can add, extract and modify individual parts without using Office programs
  Corruption or absence of any part would not prohibit the file from being opened
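
Because the file is a ZIP container of discrete parts, listing, extracting and adding parts needs nothing beyond a ZIP library. The sketch below uses Python's standard zipfile module on a tiny package built in memory. The part names (word/document.xml, docProps/core.xml, customXml/item1.xml) and the helper names are assumptions made for illustration, not names taken from this talk.

```python
import io
import zipfile


def build_demo_package() -> bytes:
    """Build a tiny ZIP whose part names are illustrative only."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("word/document.xml", "<document>Questionnaire</document>")
        zf.writestr("docProps/core.xml", "<properties><title>Demo</title></properties>")
    return buf.getvalue()


def list_parts(package_bytes: bytes) -> list:
    """Return the names of all parts in the container."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        return zf.namelist()


def extract_part(package_bytes: bytes, part_name: str) -> bytes:
    """Read one part without touching the others."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        return zf.read(part_name)


def add_part(package_bytes: bytes, part_name: str, content: str) -> bytes:
    """Append a new part; existing parts are left as they are."""
    buf = io.BytesIO(package_bytes)
    with zipfile.ZipFile(buf, "a", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(part_name, content)
    return buf.getvalue()


package = build_demo_package()
with_custom_xml = add_part(package, "customXml/item1.xml", "<Invoice/>")
```

In a real package a newly added part would also need a content type and a relationship from the part that refers to it; the sketch deliberately stops at the ZIP level.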

Demo 3: Create A Document From Scratch
Brian Jones, Program Manager, Microsoft Word

Components Of The New Formats
We make heavy use of the Open Packaging Conventions
  These are the same conventions used by the XPS guys, and you can leverage the same APIs for accessing Office files
Package: ZIP Container
Part: The “files” inside the ZIP
Content Types: Each part has a content type that is enforced on open
Relationships: Any part that references another part must do so via a relationship
[Diagram: parts of a spreadsheet package connected by relationships. Labels: Document Properties, Application Properties, Custom Doc. Props., Workbook, Sheet 1, Sheet 2, Sheet 3, Styles, Chart, Strings, Relationship]
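
To make the package vocabulary concrete, here is a minimal sketch that follows a package-level relationship to the main part and then looks up that part's content type. It assumes the conventions of the later published Open Packaging Conventions: a [Content_Types].xml part, a _rels/.rels part, and the namespace and type URIs shown. None of these strings appear in the slides, and the helper names are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumptions: URIs and part names from the later published packaging spec.
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
OFFICE_DOC_REL = (
    "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
)

CONTENT_TYPES = f"""<Types xmlns="{CT_NS}">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Default Extension="xml" ContentType="application/xml"/>
  <Override PartName="/xl/workbook.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>
</Types>"""

ROOT_RELS = f"""<Relationships xmlns="{REL_NS}">
  <Relationship Id="rId1" Type="{OFFICE_DOC_REL}" Target="xl/workbook.xml"/>
</Relationships>"""


def build_demo_package() -> bytes:
    """Build a three-part package in memory for the demonstration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("[Content_Types].xml", CONTENT_TYPES)
        zf.writestr("_rels/.rels", ROOT_RELS)
        zf.writestr("xl/workbook.xml", "<workbook/>")
    return buf.getvalue()


def main_part_name(zf: zipfile.ZipFile) -> str:
    """Follow the package-level relationship to the main document part."""
    rels = ET.fromstring(zf.read("_rels/.rels"))
    for rel in rels.findall(f"{{{REL_NS}}}Relationship"):
        if rel.get("Type") == OFFICE_DOC_REL:
            return rel.get("Target").lstrip("/")
    raise KeyError("no officeDocument relationship found")


def content_type_of(zf: zipfile.ZipFile, part_name: str):
    """Look up a part's content type: per-part override first, then extension default."""
    types = ET.fromstring(zf.read("[Content_Types].xml"))
    for override in types.findall(f"{{{CT_NS}}}Override"):
        if override.get("PartName") == "/" + part_name:
            return override.get("ContentType")
    extension = part_name.rsplit(".", 1)[-1].lower()
    for default in types.findall(f"{{{CT_NS}}}Default"):
        if default.get("Extension").lower() == extension:
            return default.get("ContentType")
    return None


with zipfile.ZipFile(io.BytesIO(build_demo_package())) as demo:
    workbook_part = main_part_name(demo)
    workbook_type = content_type_of(demo, workbook_part)
```

The point of starting from the relationship rather than a hard-coded path is that a consumer never needs to guess where a part lives inside the ZIP.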

Demos 4 & 5: Modifying An Excel Spreadsheet
Brian Jones, Program Manager, Microsoft Word

The Role Of XML
Reference and custom-defined schemas
XML Reference Schemas
  Display-oriented (e.g. Bold, Italics, Tables, Paragraphs, Styles)
  Open Document Format
  Enable Archival & File Formats Interoperability
Custom-defined Schemas
  Data-oriented (e.g., Price, Invoice)
  Represents the business information stored in the document
  Enable System Integration

The Role Of XML
Reference and custom-defined schemas: examples
XML Reference Schemas (display-oriented)
  <w:p>
    <w:r>
      <w:rPr><w:b /></w:rPr>
      <w:t>John Doe</w:t>
    </w:r>
    <w:r>
      <w:rPr><w:i /></w:rPr>
      <w:t>Health Agency</w:t>
    </w:r>
  </w:p>
Custom-defined Schemas (data-oriented)
  <ConferenceReport>
    <Date>3/24/2004</Date>
    <Summary>
      <Keyword>XML Conference (Europe)</Keyword>
      <Abstract>Role of XML on the Desktop</Abstract>
    </Summary>
    <Attendees>
      <Attendee Name="John Doe">
        <Department>Health Agency</Department>
        <Potential>
          <Sales>100</Sales>
          <Growth>25%</Growth>
          …
      </Attendee>
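
The contrast between the two kinds of schema shows up clearly when each fragment is read programmatically. In the sketch below the display-oriented markup can only answer "which text is bold or italic", while the data-oriented markup can answer "which department does John Doe belong to". The samples are trimmed, well-formed versions of the slide fragments, the w: namespace URI is an assumption taken from the later standardized format, and the function names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI of the later standardized WordprocessingML.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

DISPLAY_XML = f"""<w:p xmlns:w="{W_NS}">
  <w:r><w:rPr><w:b /></w:rPr><w:t>John Doe</w:t></w:r>
  <w:r><w:rPr><w:i /></w:rPr><w:t>Health Agency</w:t></w:r>
</w:p>"""

# Trimmed to the elements needed here; the slide's fragment is longer.
DATA_XML = """<ConferenceReport>
  <Date>3/24/2004</Date>
  <Attendees>
    <Attendee Name="John Doe">
      <Department>Health Agency</Department>
    </Attendee>
  </Attendees>
</ConferenceReport>"""


def formatted_runs(paragraph_xml: str) -> list:
    """Return (text, formatting flags) per run: all the reference schema knows."""
    paragraph = ET.fromstring(paragraph_xml)
    runs = []
    for run in paragraph.findall("w:r", NS):
        text = run.findtext("w:t", default="", namespaces=NS)
        flags = {
            name for name in ("b", "i")
            if run.find(f"w:rPr/w:{name}", NS) is not None
        }
        runs.append((text, flags))
    return runs


def department_of(report_xml: str, attendee_name: str):
    """Answer a business question directly from the custom-defined markup."""
    report = ET.fromstring(report_xml)
    for attendee in report.findall("Attendees/Attendee"):
        if attendee.get("Name") == attendee_name:
            return attendee.findtext("Department")
    return None


runs = formatted_runs(DISPLAY_XML)
department = department_of(DATA_XML, "John Doe")
```

Nothing in the display-oriented paragraph says that "Health Agency" is a department; that meaning exists only in the custom-defined markup, which is why the slides pair custom schemas with system integration.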

Demo 7: XML Data Store
Brian Jones, Program Manager, Microsoft Word

Developing Against The Formats
More Reliable Solutions
  3rd party tools were main cause of document corruptions
Fully Documented Formats
  Freely available for download with a royalty free license
  Office file format schemas: used to validate content for a given part
Samples, samples, samples
  In the form of code “snippets” for easier use and integration into your VSTO solutions
WinFX Packaging APIs
  Office Open XML Formats use the Open Packaging Conventions
  Access/maintain parts and relationships within a file
  Takes care of all ZIP level functionality
XPath
  Navigation within content
XML DOM
  Manipulating content
Office Open XML Resource Kit
  Tools for constructing and deconstructing the new file formats
  Design time validation tool
    Parses a file and reports on schema, relationship errors and warnings
  Runtime serialization tool
    Flattens package into a single file for ease of development in simple construction scenarios
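
The three layers named above (package access, XPath for navigation, DOM for manipulation) combine naturally. The sketch below is not the WinFX Packaging API: it substitutes Python's zipfile for the package layer and ElementTree for XPath and DOM work, and edits one string in a spreadsheet's string table. The part name xl/sharedStrings.xml and the SpreadsheetML namespace come from the later standardized format and are assumptions here, as are the helper names.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Assumptions: namespace and part name from the later standardized SpreadsheetML.
S_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
NS = {"s": S_NS}
STRINGS_PART = "xl/sharedStrings.xml"


def build_demo_package() -> bytes:
    """Build a two-part package in memory for the demonstration."""
    strings = f'<sst xmlns="{S_NS}"><si><t>Region</t></si><si><t>DRAFT</t></si></sst>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(STRINGS_PART, strings)
        zf.writestr("xl/workbook.xml", "<workbook/>")
    return buf.getvalue()


def replace_string(package_bytes: bytes, old: str, new: str) -> bytes:
    """Copy the package, rewriting matching entries in the string table."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == STRINGS_PART:
                root = ET.fromstring(data)
                # XPath-style navigation to every string item's text node.
                for text_node in root.findall(".//s:si/s:t", NS):
                    if text_node.text == old:
                        text_node.text = new  # DOM-style manipulation
                data = ET.tostring(root, encoding="utf-8")
            dst.writestr(info.filename, data)
    return out.getvalue()


def read_strings(package_bytes: bytes) -> list:
    """Return the text of every entry in the string table."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        root = ET.fromstring(zf.read(STRINGS_PART))
    return [t.text for t in root.findall(".//s:si/s:t", NS)]


original = build_demo_package()
updated = replace_string(original, "DRAFT", "FINAL")
```

ZIP entries cannot be rewritten in place with zipfile, so the sketch copies every part into a new container and swaps only the one it changed; a packaging API would hide that detail.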

Demo 8: Programming Against The Formats
Brian Jones, Program Manager, Microsoft Word

VSTO Support For XML Formats
VSTO application manifest becomes a part
Enables easier deployment and redeployment of VSTO solutions
Cached data feature of VSTO will be fully supported in new file formats
VSTO’s ServerDocument object will be able to manipulate the new file formats without starting Office applications

Sample Solution Scenarios
Data interoperability
Content manipulation
Content sharing and reuse
Document assembly
Document security
Managing sensitive information
Document styling
Document profiling

Next Steps
Schemas: Sneak peek at the Office “12” schemas
  We will provide an initial draft of the schemas by the end of this week. See my blog for more details
Beta 1
  Register for Beta 1, which comes out in the fourth quarter of this year

Recommended Sessions & Labs
Upcoming Sessions
  OFF 316: Word 12: Integrating Business Data into Documents using XML-based Data/View Separation and Programmability
    Tristan Davis - Room 406AB (Thursday @ 11:30)
  PRS 333: Advances in Document Workflow, Securing, Viewing, and Printing Your Content
    Gregg Brown - Room 502AB (Wednesday @ 3:15)
  OFF 322: Building a Solution Using a Spreadsheet in Server-Based Scenarios
    Danny Khen - Room 404AB (Friday @ 8:30)
Prior Sessions
  OFF 201: Office “12”: Introduction to the Programmable Customization Model for the Office “12” User Experience (Part 1)
    Jensen Harris - Room 515AB (Today @ 1)
  OFF 302: Office “12”: Developing with the Programmable Customization Model for the Office “12” User Experience (Part 2)
    Andy Himberger - Room 402AB (Today @ 2:45)
  DAT 304: Unleashing the Power of XPS-Based File Formats for your Application
    Jesse McGatha - Room 408AB (Today @ 2:45)

Resources
Office Preview Site: http://www.microsoft.com/office/preview/
Brian Jones’s Blog: http://blogs.msdn.com/Brian_Jones/
Office 2003 Reference Schema Information: http://www.microsoft.com/office/xml/
Office Developer Center: http://msdn.microsoft.com/office/

Questions

© 2005 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.