Content Migrations: Getting from A to B
Content Migrations: A Field Guide
• Author of Website Migration Handbook v2
• First large migration: World Bank (1,000+ subsites)
• Consults to large and medium organizations
• David guides complex website transformations.
Deane Barker
• Working in content management since 1996
• Founding partner in Blend Interactive
• Board member of Content Management Professionals
Planning vs. Technical
• The planning process encompasses the entire scope of your migration effort
• The technical process is just one very critical part of this process
Agenda
• David will discuss the larger planning process
  – Break
• Deane will follow with a discussion about the specific technical challenges
  – End at 4:00 p.m.
  – Deane and David will be available for discussion until 5:00 p.m.
Ask Questions
Getting from A to B
It’s painful.
[The End]
Requirements for Transfer
• You know
  – …what is being moved
  – …how it has to change on the way over
  – …how it fits back together on the other side
Agenda
• Original Content vs. Derived Content
• Content Geography
• The Four Tasks of Content Transfer
• Automated vs. Manual Import
• The Automated Import Process
• QA Automation
Original Content vs. Derived Content
Some HTML has to be moved.
Some HTML will be generated by your new system as content is imported.
Index Pages vs. Content Pages
Many pages on your new site are not rendered via content, but via
development.
Before you begin transfer, make sure you know which pages are derived and you have made plans to generate those in the new system.
Content Geography
Content has different levels of “geography”
Some content is very specifically placed, while other content is automatically organized.
Home
Products
Product A
Product B
About
History
Press Release
Highly-geographical content is much harder to migrate.
You have to migrate both the content and the placement.
Pop Quiz: Why are blogs so easy to migrate?
No geography.
Lots of derived index pages.
Hierarchical content requires you to determine and transfer structure
Home
Products
Product A
Product B
About
History
Stub Mapping
Existing Home
Products
Product A
Product B
About
History
New
The Path to Stub Mapping
• “We need to codify the new website structure…”
• “…let’s just store this in the new CMS…”
• “…and let’s store the old URL, just for reference…”
• “…and…can we just use that old URL to transfer the content?”
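The idea behind stub mapping can be sketched in a few lines. This is a minimal illustration, not any particular CMS's API: the `Stub` class, the `fill_stubs` function, and the `old_site` dictionary are all hypothetical names, and a real fetch would query the old CMS or request the old page rather than read from a dictionary.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Stub:
    """A placeholder page in the new site structure (hypothetical model)."""
    title: str
    old_url: Optional[str] = None  # None: a brand-new page with nothing to transfer
    body: str = ""
    children: List["Stub"] = field(default_factory=list)


def fill_stubs(stub: Stub, fetch: Callable[[str], str]) -> int:
    """Walk the new tree and pull old content into every stub that has an old URL."""
    filled = 0
    if stub.old_url is not None:
        stub.body = fetch(stub.old_url)
        filled += 1
    for child in stub.children:
        filled += fill_stubs(child, fetch)
    return filled


# Stand-in for the old site, keyed by old URL (assumed data for illustration).
old_site = {
    "/products/a.html": "Product A copy",
    "/about/history.html": "Company history",
}

# The new structure is built first; each stub remembers where its content lives today.
new_site = Stub("Home", children=[
    Stub("Products", children=[Stub("Product A", old_url="/products/a.html")]),
    Stub("About", children=[Stub("History", old_url="/about/history.html")]),
])

filled_count = fill_stubs(new_site, old_site.__getitem__)
```

The new structure exists before any content moves, so placement is decided once, by people, and the transfer only has to fill in the bodies.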
The Four Tasks of Content Transfer
The Four Tasks
• Extract
• Transform
• Import
• Normalize
• We can generalize about the first two
  – Extract and transform are platform-agnostic
#1: Extract
• Get content out of the existing system
• Break content into its necessary components
• Store in a neutral format
  – XML, usually
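As a rough sketch of the last two bullets, the snippet below breaks one page into components and writes it out as XML using Python's standard `xml.etree.ElementTree`. The element names (`content`, `title`, `summary`, `body`), the `old-url` attribute, and the shape of the `page` dictionary are assumptions made for illustration; the right components depend on your content.

```python
import xml.etree.ElementTree as ET


def to_neutral_xml(page: dict) -> str:
    """Serialize one extracted page as a standalone XML fragment."""
    item = ET.Element("content", {"type": page["type"], "old-url": page["old_url"]})
    ET.SubElement(item, "title").text = page["title"]
    ET.SubElement(item, "summary").text = page["summary"]
    # ElementTree escapes the markup, so the HTML body travels as plain text.
    ET.SubElement(item, "body").text = page["body_html"]
    return ET.tostring(item, encoding="unicode")


# Hypothetical page as it might come out of the old system.
page = {
    "type": "press-release",
    "old_url": "/news/2009/launch.html",
    "title": "Product Launch",
    "summary": "We launched a product.",
    "body_html": "<p>Full text of the release.</p>",
}

xml_text = to_neutral_xml(page)
```

Note that the old URL is carried along with the content from the very first step; later tasks depend on it.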
Migrating out of a CMS is a lot easier than the alternative.
CMS enforces at least some consistency.
Are you going to extract from the repository level or the publication
level?
Repository vs. Publication Extraction
[Diagram: Repository, HTML, Processing]
You may need to make changes to your old site to make
extraction easier or more complete.
You do not have to wait for anything to do this.
You can start extraction on the very day you decide to migrate your website.
#2: Transform
• Modify extracted content
• Fix legacy problems with the content
• Adapt content to fit the new architecture
• Neutralize idiosyncrasies in the content
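One way to organize transformation is as a list of small functions applied in order, so that each legacy problem gets its own fix. The three fixes below (dropping `<font>` tags, turning `<b>` into `<strong>`, replacing `&nbsp;` entities) are illustrative assumptions, not a list from the talk, and regular expressions are a shortcut here; badly formed legacy HTML usually calls for a real parser.

```python
import re
from typing import Callable, List

Transform = Callable[[str], str]


def drop_font_tags(html: str) -> str:
    """Remove <font ...> and </font> tags but keep the text inside them."""
    return re.sub(r"</?font\b[^>]*>", "", html, flags=re.IGNORECASE)


def bold_to_strong(html: str) -> str:
    """Replace presentational <b> with <strong>; <br> and <body> are left alone."""
    html = re.sub(r"<b(\s[^>]*)?>", "<strong>", html, flags=re.IGNORECASE)
    return re.sub(r"</b>", "</strong>", html, flags=re.IGNORECASE)


def replace_nbsp(html: str) -> str:
    """Turn non-breaking-space entities into ordinary spaces."""
    return html.replace("&nbsp;", " ")


def apply_transforms(html: str, steps: List[Transform]) -> str:
    """Run every transformation in order and return the cleaned content."""
    for step in steps:
        html = step(html)
    return html


steps = [drop_font_tags, bold_to_strong, replace_nbsp]
cleaned = apply_transforms('<font color="red"><b>Hi</b>&nbsp;there</font><br>', steps)
```

Keeping each fix separate makes it easy to add, remove, or reorder transformations between migration cycles.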
Content Transformation
Common Transformations
#3: Import
• Move post-transformed content from a neutral format into the new system
• This is different for every CMS
• This capability should be part of the evaluation process
#4: Normalize
• Fix problems that are only “fixable” once content is in its new home
• Ex:
  – Relationship reconstruction
  – URL resolution
  – Navigation reconstruction
Content relationships can introduce chicken-and-egg problems.
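A common way around the chicken-and-egg problem is to import in two passes: create every item first, then connect them once they all exist. The sketch below assumes each extracted item carries an `old_id` and a list of related old IDs; the function name and the in-memory `store` standing in for the new CMS are hypothetical.

```python
from typing import Dict, List, Tuple


def import_in_two_passes(items: List[dict]) -> Tuple[Dict[int, dict], List[Tuple[str, str]]]:
    """Pass 1 creates every item; pass 2 rebuilds relationships between them."""
    store: Dict[int, dict] = {}    # stand-in for the new CMS repository
    new_ids: Dict[str, int] = {}   # old ID -> new ID

    # Pass 1: create all content without relationships.
    for new_id, item in enumerate(items, start=1):
        store[new_id] = {"title": item["title"], "related": []}
        new_ids[item["old_id"]] = new_id

    # Pass 2: every target now exists (or never will), so references can be resolved.
    unresolved: List[Tuple[str, str]] = []
    for item in items:
        for old_ref in item.get("related", []):
            if old_ref in new_ids:
                store[new_ids[item["old_id"]]]["related"].append(new_ids[old_ref])
            else:
                unresolved.append((item["old_id"], old_ref))
    return store, unresolved


items = [
    {"old_id": "a", "title": "Product A", "related": ["b", "zzz"]},  # "b" comes later
    {"old_id": "b", "title": "Product B", "related": ["a"]},
]
store, unresolved = import_in_two_passes(items)
```

Anything left in `unresolved` points at content that was never imported and needs a human decision.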
How will URLs change on the new platform?
If your content is interlinked, how are you going to keep all those links valid?
Embedded URLs
Embedded URL Resolution
• If you have embedded URLs, they are now broken.
• How do you “re-connect” these URLs to the correct content?
• Usually performed as some kind of batch job.
  – You rarely get 100% accuracy.
  – Prepare to catch the remainder in QA.
Always store the old URL for a migrated page of content.
How it Works
• Iterate over every piece of content…
• …then iterate over every single property looking for anything that might contain links…
• …then iterate over all those links looking for the new content holding that old link…
• …then correct the link.
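The loop described above might look like the following batch job. It is a sketch under stated assumptions: each page is a dictionary with `old_url`, `new_url`, and a `properties` dictionary; links are found with a simple `href="…"` regular expression; and only site-relative links (starting with `/`) are treated as candidates. None of these names come from a real CMS.

```python
import re
from typing import Dict, List, Tuple

HREF = re.compile(r'href="([^"]+)"')


def resolve_embedded_links(pages: List[dict]) -> List[Tuple[str, str]]:
    """Rewrite old internal links to new URLs; return the links that had no match."""
    # The stored old URL of every migrated page is the lookup key.
    old_to_new: Dict[str, str] = {p["old_url"]: p["new_url"] for p in pages}
    unresolved: List[Tuple[str, str]] = []

    for page in pages:  # every piece of content
        def fix(match, page=page):
            old = match.group(1)
            if not old.startswith("/"):
                return match.group(0)  # external link: leave untouched
            if old in old_to_new:
                return f'href="{old_to_new[old]}"'  # correct the link
            unresolved.append((page["new_url"], old))  # catch this one in QA
            return match.group(0)

        for name, value in list(page["properties"].items()):  # every property
            if isinstance(value, str):  # only text can contain links
                page["properties"][name] = HREF.sub(fix, value)
    return unresolved


pages = [
    {"old_url": "/prod/a.asp", "new_url": "/products/product-a",
     "properties": {"body": '<a href="/about/hist.asp">History</a> <a href="/gone.asp">Gone</a>',
                    "weight": 3}},
    {"old_url": "/about/hist.asp", "new_url": "/about/history",
     "properties": {"body": '<a href="https://example.com/">External</a>'}},
]
leftover = resolve_embedded_links(pages)
```

The returned list is the remainder that did not resolve, which is what gets handed to QA.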
Once migrated, use the old URL to do a lookup in your 404 handler.
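A minimal sketch of that lookup, independent of any web framework: the function name `handle_missing` and the `old_to_new` table are hypothetical, and how the result is turned into an actual HTTP response depends on your platform.

```python
from typing import Dict, Optional, Tuple


def handle_missing(path: str, old_to_new: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Called when a request matches nothing on the new site."""
    new_url = old_to_new.get(path)
    if new_url is not None:
        return 301, new_url  # permanent redirect to the content's new home
    return 404, None         # genuinely missing


# Built from the old URL stored on every migrated page (assumed data).
old_to_new = {"/prod/a.asp": "/products/product-a"}

redirect = handle_missing("/prod/a.asp", old_to_new)
not_found = handle_missing("/never-existed.asp", old_to_new)
```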
If you can preserve binary file URLs, do so. Your new CMS will likely make
this easier.
Depending on volume, menu reconstruction might be a manual process.
Automated vs. Manual Import
What is the actual mechanism of movement?
Copy-and-paste?
Automated?
When Copy-and-Paste Works
• When you don’t have a lot of content
• When you have access to cheap labor
• When your content is highly geographic
• When you cannot automate transformation
• When you have enough resources for sufficient QA
When Automated Migration Works
• When you have large volumes of content
• When your content is not highly-geographic
• When you have sufficient technology and/or development resources
You don’t have to use the same method for your entire project.
The Automated Migration Process
Automated Migration Tools
• Great answer to the Transfer phase
• Less of an answer to everything else
• They still have to be configured and tested
The Promise:
You will be able to develop a script that will reduce your migration to a button-click.
The Promise:
You will run this script, need to do nothing else, then launch your new website.
The Value-Add
• A scripting environment
• Tested tools for:
  – Extraction
  – Transformation
  – Import (maybe…)
• Professional services
$$$$
Automated Migration Process
• Develop automated migration script
  – Configure
  – Execute
  – Evaluate
  – (Repeat)
• Accept a cycle “as good as is reasonable”
• Perform necessary manual editing
• Re-do changes during content freeze
• Launch
Automated migrations are highly iterative.
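The configure-execute-evaluate loop can be sketched as below. Everything here is an assumption for illustration: in practice a person revises the configuration after each evaluation, whereas this sketch takes the sequence of configurations up front, and "as good as is reasonable" is reduced to a simple problem-count threshold.

```python
import re
from typing import Callable, List, Optional, Tuple


def run_migration_cycles(
    configs: List[dict],
    execute: Callable[[dict], List[str]],
    evaluate: Callable[[List[str]], List[str]],
    max_problems: int,
) -> Tuple[Optional[int], List[str], List[Tuple[int, int]]]:
    """Repeat execute/evaluate until a cycle is good enough to accept."""
    history: List[Tuple[int, int]] = []
    problems: List[str] = []
    for cycle, config in enumerate(configs, start=1):
        output = execute(config)        # run the migration script
        problems = evaluate(output)     # find what is still wrong
        history.append((cycle, len(problems)))
        if len(problems) <= max_problems:
            return cycle, problems, history  # accepted: the rest is manual editing
    return None, problems, history           # no cycle was acceptable


source_pages = ["<font>a</font>", "b", "<FONT>c</FONT>"]


def execute(config: dict) -> List[str]:
    """Toy migration: optionally strip <font> tags."""
    if not config["strip_fonts"]:
        return list(source_pages)
    return [re.sub(r"</?font[^>]*>", "", p, flags=re.IGNORECASE) for p in source_pages]


def evaluate(output: List[str]) -> List[str]:
    """Toy evaluation: any page that still contains a <font> tag is a problem."""
    return [p for p in output if "<font" in p.lower()]


configs = [{"strip_fonts": False}, {"strip_fonts": True}]
accepted_cycle, remaining, history = run_migration_cycles(configs, execute, evaluate, 0)
```

Here the first cycle leaves two problem pages and the second leaves none, so the second cycle is the one that gets accepted.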
Configure-Execute-Evaluate
Automated Migration Cycle
Configure
Execute
Evaluate
Manual Editing
Iterate again…
Launch
Weeks? Months? Days? Minutes?
“As good as is reasonable…”
Once you accept the output of a migration cycle, you are in a content freeze.
Handling a Content Freeze
• Don’t change any content on the existing site
• Track changes so they can be re-changed on the new site
QA Automations
Ideally, track the QA process inside the CMS
itself.
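One way to read this: run automated checks over the imported content and write the result back onto each item, so editors can filter and work through the failures in the CMS. The checks, the `qa_status` and `qa_issues` property names, and the page dictionaries below are all hypothetical; a real version would use the new CMS's own API to read and update content.

```python
import re
from typing import List, Set

INTERNAL_LINK = re.compile(r'href="(/[^"]*)"')


def run_qa(pages: List[dict], known_urls: Set[str]) -> List[dict]:
    """Check every page, record the outcome on the page, return the failures."""
    for page in pages:
        issues: List[str] = []
        if not page["body"].strip():
            issues.append("empty body")
        if not page.get("old_url"):
            issues.append("missing old URL")
        for link in INTERNAL_LINK.findall(page["body"]):
            if link not in known_urls:
                issues.append(f"broken link: {link}")
        # Stored on the content item itself, as if it were a CMS property.
        page["qa_issues"] = issues
        page["qa_status"] = "needs review" if issues else "passed"
    return [p for p in pages if p["qa_issues"]]


pages = [
    {"new_url": "/about/history", "old_url": "/about/hist.asp",
     "body": '<a href="/products/product-a">Product A</a>'},
    {"new_url": "/products/product-a", "old_url": "",
     "body": '<a href="/gone.asp">Gone</a>'},
]
failures = run_qa(pages, known_urls={p["new_url"] for p in pages})
```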