Migrate All The Things! - DrupalCon...ABOUT ME • Doing Drupal since 2008 • Love open source and...

Post on 28-Jun-2020

1 views 0 download

Transcript of Migrate All The Things! - DrupalCon...ABOUT ME • Doing Drupal since 2008 • Love open source and...

Our expertise, your digital DNA | evolvingweb.ca | @evolvingweb

MIGRATE ALL THE THINGS!Better Drupal workflows, using Migrate

Dave Vasilevskyvasi@evolvingweb.ca

twitter.com/djvasi @vasi on drupal.org, github

ABOUT ME

• Doing Drupal since 2008

• Love open source and the Drupal community

• Contributor to Migrate in core (eg: multilingual nodes in D6 → D8 upgrades

• Drupal development, consulting and training since 2007 • Based in Montreal, clients all over Canada and USA • Very involved in the Drupal community

http://evolvingweb.ca

Our specialties

Multilingual Solr search Testing

Custom Drupal applications Responsive themes and of course…

Migrate

ONCE UPON A TIME…

ONCE UPON A TIME…

“We’re going to save some money. You just build the site—we’ll add all the content

ourselves.”- The Client

A beautiful site, only missing contentONCE UPON A TIME…

A beautiful site, only missing contentONCE UPON A TIME…

ONCE UPON A TIME…

• Bodies that are only images or tables

• Relationships or tags that weren’t identified

• Much shorter or longer text than expected

• Many more or fewer nodes than expected

So many things can go wrong!

EX: TRENT UNIVERSITY

EX: TRENT UNIVERSITY

A SOLUTION

• Force client to actually write content quickly, improves likelihood of launch on-time

• Client can use tools they know, even without Drupal training

• Can check if IA is correct, and adjust before it’s too late

• Each developer can import data on their own computer

MIGRATE?• Yes, it’s about migrate. So what’s migrate?

MIGRATE?• Yes, it’s about migrate. So what’s migrate?

• A system that allows importing structured data into Drupal, usually as entities: nodes, users, terms…

• Data can come from many different places: DB, CSV, XML…

• Very extensible

• Included in Drupal 8 core (experimental)

• What is migrate not?

MIGRATE?

• There’s already lots of info about what migrations are, how to write a migration:

• Michael Anello at DCNJ 2017: http://tiny.cc/ultimikeMigrate

• Evolving Web blog series: http://tiny.cc/ewMigrate

• These aren’t the focus of the talk

MIGRATE?• This talk isn’t just about how to start. More about why and

when.

• Common impression is that migrate is only for:

MIGRATE?• This talk isn’t just about how to start. More about why and

when.

• Common impression is that migrate is only for:

• Upgrades from D6/D7 to Drupal 8

MIGRATE?• This talk isn’t just about how to start. More about why and

when.

• Common impression is that migrate is only for:

• Upgrades from D6/D7 to Drupal 8

• Moving from legacy sites to Drupal

MIGRATE?• This talk isn’t just about how to start. More about why and

when.

• Common impression is that migrate is only for:

• Upgrades from D6/D7 to Drupal 8

• Moving from legacy sites to Drupal

MIGRATE?

• But there are also many new content workflows that migrate enables

• So don’t just migrate things that feel like migrations—migrate all the things

MIGRATE?

• But there are also many new content workflows that migrate enables

• So don’t just migrate things that feel like migrations—migrate all many of the things

Our expertise, your digital DNA | evolvingweb.ca | @evolvingweb

WORKFLOW: CONTENT FIRSTImplementation

THE PARTS OF MIGRATE

Source Destination

Process

THE PARTS OF MIGRATE

Source Destination

Process

CSVFile

First Last Job

Dave Vasilevsky Backend

Jorge Diaz Frontend

Alex Dergachev Marketing

THE PARTS OF MIGRATE

Source Destination

Process

CSVFile Users

First Last Job

Dave Vasilevsky Backend

Jorge Diaz Frontend

Alex Dergachev Marketing

name field_job

Dave Vasilevsky Backend

Jorge Diaz FrontendAlex Dergachev Marketing

THE PARTS OF MIGRATEProcess

FirstLastJob field_job

nameSource Destination

CSVFile Users

First Last Job

Dave Vasilevsky Backend

Jorge Diaz Frontend

Alex Dergachev Marketing

name field_job

Dave Vasilevsky Backend

Jorge Diaz FrontendAlex Dergachev Marketing

THE PARTS OF MIGRATE

• migrate_plus: Allows a migration to be defined with YAML config file

• ‘Migration’: Definition of source + process + destination

• migrate_tools: Allows running migrations using drush

Contrib modules

id: test label: Test CSV migration source: plugin: csv path: public://test.csv header_row_count: 1 keys: [Title] destination: plugin: entity:node default_bundle: page process: title: Title body: Body

test.csv migrate_plus.migration.test.yml

Title Body

About Evolving Web…

Careers If you're interested…

Projects We work with…

test.csv migrate_plus.migration.test.ymlid: test label: Test CSV migration source: plugin: csv path: public://test.csv header_row_count: 1 keys: [Title] destination: plugin: entity:node default_bundle: page process: title: Title body: Body

Title Body

About Evolving Web…

Careers If you're interested…

Projects We work with…

test.csv migrate_plus.migration.test.ymlid: test label: Test CSV migration source: plugin: csv path: public://test.csv header_row_count: 1 keys: [Title] destination: plugin: entity:node default_bundle: page process: title: Title body: Body

Title Body

About Evolving Web…

Careers If you're interested…

Projects We work with…

test.csv migrate_plus.migration.test.ymlid: test label: Test CSV migration source: plugin: csv path: public://test.csv header_row_count: 1 keys: [Title] destination: plugin: entity:node default_bundle: page process: title: Title body: Body

Title Body

About Evolving Web…

Careers If you're interested…

Projects We work with…

A SOLUTION• Force client to actually write content quickly, improves

likelihood of launch on-time

• Client can use tools they know, even without Drupal training

• Can check if IA is correct, and adjust before it’s too late

• Each developer can import data on their own computer. No database sharing!

What we gain

WORKFLOW: CONTENT FIRST

• Only migrate when it’s worth it

• Spreadsheets are too simple

• Images are hard

• Related content is hard

⚠ DANGER ⚠

WORKFLOW: CONTENT FIRST, IN DRUPAL

• Drupal has great content management UI

• Rich text editor

• Images

• References

• Don’t need complete theme and behaviour to create content, just content type definitions

WORKFLOW: CONTENT FIRST, IN DRUPAL

• Provide client a content staging site

• They can build all the content there

• Migrate it into Drupal, just like before

• Same advantages of “Content first” workflow

WORKFLOW: CONTENT FIRST, IN DRUPAL

[ { "title": "My node", "path": "/node/1", "body": "<p>This is a sample node.</p>\n" }, { "title": "Another node", ... }, ... ]

REST module to export content as JSON

WORKFLOW: CONTENT FIRST, IN DRUPAL

Company

Evolving Web

IBM

Apple

User Company

Dave Evolving Web

Jorge Evolving Web

Employee: Davenid: 9

Employee: Jorgenid: 10

Company: EWnid: 3

3

3

Relationships

WORKFLOW: CONTENT FIRST, IN DRUPAL

• Migrate supports more complex mappings: “process plugins”

• There’s a list of ones in core: https://www.drupal.org/docs/8/api/migrate-api/migrate-process

process: field_name: plugin: concat delimiter: “ “ source: - firstname - lastname

Relationships

WORKFLOW: CONTENT FIRST, IN DRUPAL

Company

Evolving Web

IBM

Apple

User Company

Dave Evolving Web

Jorge Evolving Web

process: field_employer: plugin: migration source: employer migration: companies

Relationships

EX: EVOLVING WEB SITE

EX: EVOLVING WEB SITE

EX: EVOLVING WEB SITE

EX: EVOLVING WEB SITE

WORKFLOW: CONTENT FIRST, IN DRUPAL

• Users can gather complex content, early

• Use polished Drupal UI to do so

What we gain

WORKFLOW: CONTENT FIRST, IN DRUPAL

• Authentication: Shield + migrate_plus

• Translations: Two migrations

• No built-in export: Menus, custom blocks

Complications

WORKFLOW: CONTENT FIRST, IN DRUPAL

• Site doesn’t look pretty

• Once the IA is nailed down—stop!

⚠ DANGER ⚠

WORKFLOW: MIRROR• External tool may be better than Drupal for managing certain

content. Eg:

• Membership management system

• Image management systems

• Tools specific to data type, like course catalog

• Manage data with external system

• Keep Drupal in sync

WORKFLOW: MIRROR

Rendering

Logic

Permissions

Content

WORKFLOW: MIRROR

Rendering

Logic

Permissions

Content

WORKFLOW: MIRROR

Logic

Permissions

Content

Headless Drupal

WORKFLOW: MIRROR

Rendering

Logic

Permissions

WORKFLOW: MIRROR

Rendering

Logic

Permissions

Footless Drupal?

EX: COUNCIL FOR RESPONSIBLE NUTRITION

EX: COUNCIL FOR RESPONSIBLE NUTRITION

EX: COUNCIL FOR RESPONSIBLE NUTRITION

WORKFLOW: MIRROR

• Update what’s changed, and only what’s changed

• Delete entities that are no longer in source

• Trigger without drush

• Make it hard to make mistakes

Recurring migrations

WORKFLOW: MIRROR

• Only update what’s changed: incremental migration

source: track_changes: true

WORKFLOW: MIRROR

• Trigger on cron, or via UI

$migration_manager = \Drupal::service('plugin.manager.config_entity_migration'); $migration = $migration_manager->createInstance('my_migration'); $executable = new MigrateExecutable($migration, new MigrateMessage()); $executable->import();

Triggering migrations without drush

WORKFLOW: MIRROR

• If users are uploading data files, we need to tell migrate about them.

$path = $form_state->getValue(['source', 0])->getFileUri();

$source = $migration->getSourceConfiguration(); $source['path'] = $path; $migration->set('source', $source);

Dynamic source

WORKFLOW: MIRROR• Users want to know what’s about to happen, check if it’s

expected

• We can look at the Migrate source object, and check what’s new/updated/deleted: http://tiny.cc/migrateDryRun

term$ drush scr dry_run.php UPDATED: 2 UPDATED: 4 NEW: 5 DELETED: 3

WORKFLOW: MIRROR

• In core, migrate will never delete items that have been removed from source.

• But now we know what needs deletion, so we can add a hook to do that too.

• BE CAREFUL! Validate your data

Delete

WORKFLOW: MIRROR

• migrate_devel module will print every row before migrating it

• Look at migrate_map_foo and migrate_message_foo DB tables

• Use XDebug and step through!

Debugging

WORKFLOW: MIRROR

• Use the right tool for the right job: Data can live in external tool if it’s better for the job

• Don’t need to write any weird integrations (eg: external_entities)

What we gain

WORKFLOW: MIRROR

• Validate your data!

• Data will be overwritten

• Never do bidirectional sync. It will break.

⚠ DANGER ⚠

WORKFLOW: CONTENT IN GIT

• Example of an external system: Markdown files in a git repo

EX: ALLSEEN DEV DOCS

term$ git log commit 944c5ff6d776c0c2b40af0b4e6e3a519ddde9dc8 Author: xxxx <xxx.com> Date: Thu Jul 16 15:57:52 2015 -0400

generate_scripts.js fixups to work in Refresh, refs #11506

commit df54b1a275d6a6f4aab70c4180fe146b383cd14a Merge: 76fe40a af2fd60 Author: xxxx <xxx.com> Date: Thu Jun 25 19:31:41 2015 +0000

Merge "ASADOC-51: figure 2-4 in the security 2.0 HLD is incorrect"

commit af2fd60b9e2668255dce27be769bb21d5fd14722 Author: xxxx <xxx.com> Date: Tue Jun 23 16:42:22 2015 -0700

ASADOC-51: figure 2-4 in the security 2.0 HLD is incorrect Move ConfigureWiFi call after claim process block Change-Id: I1afa5334092fc0431fc7d709a202c3a6a55ff72e

EX: ALLSEEN DEV DOCS

term$ git log commit 944c5ff6d776c0c2b40af0b4e6e3a519ddde9dc8 Author: xxxx <xxx.com> Date: Thu Jul 16 15:57:52 2015 -0400

generate_scripts.js fixups to work in Refresh, refs #11506

commit df54b1a275d6a6f4aab70c4180fe146b383cd14a Merge: 76fe40a af2fd60 Author: xxxx <xxx.com> Date: Thu Jun 25 19:31:41 2015 +0000

Merge "ASADOC-51: figure 2-4 in the security 2.0 HLD is incorrect"

commit af2fd60b9e2668255dce27be769bb21d5fd14722 Author: xxxx <xxx.com> Date: Tue Jun 23 16:42:22 2015 -0700

ASADOC-51: figure 2-4 in the security 2.0 HLD is incorrect Move ConfigureWiFi call after claim process block Change-Id: I1afa5334092fc0431fc7d709a202c3a6a55ff72e

EX: ALLSEEN DEV DOCS

WORKFLOW: CONTENT IN GIT

• In Drupal 8, almost everything is just a plugin

• That goes for Migrate sources and process plugins too

• They’re pretty well-designed, don’t be afraid to write one

Getting all the files

/** * @MigrateSource( * id = "markdown" * ) */ class Markdown extends SourcePluginBase {

public function initializeIterator() { $found = file_scan_directory($this->configuration['path'], '/.md$/'); foreach ($found as $item) { yield (array) $item; } }

public function __toString() { return $this->configuration['path']; }

public function fields() { return ['uri' => $this->t('URI')]; }

public function getIds() { return ['uri' => ['type' => 'string']]; }

}

WORKFLOW: CONTENT IN GIT

• Our client kept a hierarchical YAML file, to turn into a menu

• But migrate wants a list of rows, not a tree!

• RecursiveIteratorIterator makes this simple: http://tiny.cc/RecIterIter

Source data can be weird

WORKFLOW: CONTENT IN GIT

• Catch problems early: Check the resulting entities, see if they make sense

• Are there broken links? Orphan nodes, without a menu item?

• Compare generated test site with Drupal: https://github.com/evolvingweb/sitediff

Validation

WORKFLOW: CONTENT IN GIT

• Best tools for the job

• Git comes with: branches, merging, global ID, blame…

• Can use whichever Markdown editor they like

• Can revert to earlier version of site!

What we gain

WORKFLOW: PARTIAL UPGRADES• A D6/7 to D8 upgrade by default includes complete config/

content

• This leaves cruft, eg; obsolete fields

• Things have changed in ten years! You probably want new content

• Building the new site from scratch means rewriting all your content

WORKFLOW: PARTIAL UPGRADES

• Build upgraded site from scratch, using new best-practices

• But keep some content. Only what you need!

WORKFLOW: PARTIAL UPGRADES

term$ drush migrate-upgrade --configure-only —legacy-db-url=mysql://localhost/test Exporting d7_dblog_settings as upgrade_d7_dblog_settings Exporting d7_filter_format as upgrade_d7_filter_format Exporting d7_filter_settings as upgrade_d7_filter_settings Exporting d7_global_theme_settings as upgrade_d7_global_theme_settings Exporting d7_image_settings as upgrade_d7_image_settings Exporting d7_image_styles as upgrade_d7_image_styles Exporting d7_language_types as upgrade_d7_language_types Exporting d7_node_settings as upgrade_d7_node_settings ...

There’s a shortcut!

WORKFLOW: PARTIAL UPGRADES• Export config, delete almost all the migrations.

• Only keep what you want to save, eg:

• Some content types: upgrade_d7_node_page

• Tags: upgrade_d7_taxonomy_term_tags

• These contain everything you need to migrate those types, even the hard parts of the migration

• Modify as needed.

• Eg: If you don’t migrate users, don’t keep the uid in the migrations

WORKFLOW: PARTIAL UPGRADES

• The D6/7 to D8 migrations include every file on the site. Could be many GBs!

• It can be worth only keeping the ones referred to by content you’re keeping:

• http://tiny.cc/migrateFilterFiles

Filtering files

WORKFLOW: PARTIAL UPGRADES

• URLs that break hurt SEO a lot

• Keep the URL alias migration. Migrate redirects as well.

• Since you’re throwing out some content, some links will die

• Have a plan for adding redirects

• Check your site afterwards for broken links

• Proactive: Sitediff https://github.com/evolvingweb/sitediff

• Reactive: Google Webmaster Tools

Broken links

EX: EVOLVING WEB SITE

EX: EVOLVING WEB SITEid: d7_blog label: 'Nodes (Story)' process: uid: plugin: migration migration: d7_user source: node_uid revision_timestamp: timestamp body/0/value: - source: body/0/value plugin: regex pattern: 'http://evolvingweb\.ca/sites/default/files/([^"]*)' replace: '/sites/default/files/d7/\3"'

EX: EVOLVING WEB SITEid: d7_blog label: 'Nodes (Story)' process: uid: plugin: migration migration: d7_user source: node_uid revision_timestamp: timestamp body/0/value: - source: body/0/value plugin: regex pattern: 'http://evolvingweb\.ca/sites/default/files/([^"]*)' replace: '/sites/default/files/d7/\3"'

EX: EVOLVING WEB SITEid: d7_blog label: 'Nodes (Story)' process: uid: plugin: migration migration: d7_user source: node_uid revision_timestamp: timestamp body/0/value: - source: body/0/value plugin: regex pattern: 'http://evolvingweb\.ca/sites/default/files/([^"]*)' replace: '/sites/default/files/d7/\3"'

WORKFLOW: PARTIAL UPGRADES

• Keep the content that’s worth keeping

• Keep URLs working

• Throw out the junk, build a beautiful new site

• Can merge data, as long as there are no shared references

What we gain

WHAT YOU JUST SAW• Intro to migrate

• Workflows

• Content first

• Content first, in Drupal

• Mirror

• Content in git

• Partial upgrades

Our expertise, your digital DNA | evolvingweb.ca | @evolvingweb

WHAT YOU JUST SAWMigrate can improve your content workflow!

WHAT TO DO NOW

• Find me on Twitter @djvasi

• Learn more about migrate

• Try out these content workflows

WHAT TO DO NOW

• Public: Chicago, Montreal, Toronto, NJ, NYC

• Private: Health Canada, McGill University, remote

• For new content editors, new developers, teams starting on D8…

Evolving Web training

JOIN US FOR CONTRIBUTION SPRINTS

First-Time Sprinter Workshop9:00am-12:00pmRoom: 307-308

#drupalsprints

Friday, April 28, 2017

Mentored Core Sprint9:00am-12:00pmRoom:301-303

General Sprints9:00am-6:00pmRoom:309-310