Automating Drupal Migrations
How to go from an Estimated One Week to Two Minutes Down Time
About Dan Harris
● Founder, Webdrips.com
  ○ Drupal-based web design and development shop
  ○ Founded in July 2011
  ○ Nine years of Drupal experience
  ○ 21 years of professional experience
● Twitter: @webdrips
● Email: [email protected]
Note About the Migration Process
Although we’re covering a Drupal 6 to 7 migration in this presentation, most if not all of the ideas presented here should work for any Drupal-to-Drupal migration.
Overview: Initial Plan/Estimates
● Initial estimate: one week of downtime
● SQL queries would be used to export/import when coverage was limited with Drupal Migrate
● Only automation provided by Migrate modules
● Existing Drupal 7 architecture
Overview: Updated Plan
● Virtually zero downtime
  ○ Intermediate: asking for one day of downtime or less
● Complete migration in one business day
● Over 99% automated
● D7 site to be built from scratch during migration
About the Drupal 6 Site
● Architecturally, was a mess (Frankensite)
  ○ Migration provided a chance to clean up architecture and code
● Six custom themes (1 custom/5 subthemes)
● 35 custom modules
● 151 contributed modules
About the Drupal 6 Site
● 1000 privileged users
● About 400k non-privileged users
● 25 content types, including Webforms
● Over 2,500 pages
About the Drupal 7 Site
● 106 modules
● Bootstrap primary theme
● One Bootstrap subtheme, four sub-subthemes
● Six content types only
● 11 Features provided architecture
Automated Migration Process
Requirements
● Migrate modules: migrate, migrate_extras, migrate_d2d, migrate_webform
● Import modules: menu_import, path_redirect_import
● Four custom modules
● Scripts: migration and deployment
● Fast server with SSD
Migration Script Overview
Requirements:
● Create new Drupal D7 site
● Build out site architecture with features
● Enable modules
● Migrate D6 to D7
● Import items that couldn’t be migrated
This provided a repeatable, reliable process.
Migration Script Highlights (Review)
Build the site:
    drush site-install
Enable features and modules:
    drush en feature_name -y
Migrate each entity:
    drush mi entity
Custom Migration Modules
1. Disable “edits” to the D6 site
   a. Basically redirect webform pages, admin pages, and paths like node/add, node/edit, etc.
2. Views (implemented with features) only for migration status and post-processing
3. Migrate_d2d module
4. CSV-based migration
Drupal Migrate/D2D/Extras
● Handled most of the heavy lifting
  ○ Everything except menu links, path redirects, and slide shows
● Extensive drush support
● Plenty of methods available to massage data
● D2D: simplifies migration code
Migrating Users
Challenges
● Nearly 400K unprivileged users
● Needed to assign users to organic groups
  ○ Based on how webform questions were answered
● Had to fix user passwords
  ○ Fixed by writing directly to the users table inside the migration
Migrate Users Code
Unprivileged vs. Privileged was a simple query:
class NvidiaPrivilegedUserMigration extends NvidiaUserMigration {
  protected function query() {
    $query = parent::query();
    // Privileged users are matched on the company mail domain; the
    // unprivileged migration uses 'NOT LIKE' with the same pattern.
    $query->condition('u.mail', '%nvidia.com', 'LIKE');
    return $query;
  }
}
Migrate Users Code
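Fix the password:
public function complete($account, $row) {
  parent::complete($account, $row);
  // Overwrite the stored password with the value from the source row.
  $account->pass = $row->pass;
  db_update('users')
    ->fields(array('pass' => $account->pass))
    ->condition('uid', $account->uid)
    ->execute();
  // Assign the user to organic groups based on webform answers.
  $this->nvidia_memberships($row);
}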
Assign Users to Groups (Review)
public function nvidia_memberships($row) {
  // Collect this user's webform answers from the D6 source database.
  $membership_query = Database::getConnection('default', 'd6source')->select('webform_submissions', 'ws');
  $membership_query->join('webform_submitted_data', 'wd', 'wd.sid = ws.sid');
  $membership_query->fields('wd', array('cid'));
  $membership_query->fields('ws', array('nid'));
  $membership_query->addExpression('group_concat(data)', 'data');
  $membership_query->groupBy('ws.sid');
  $membership_query->groupBy('cid');
  $membership_query->condition('ws.uid', $row->uid);
  $membership_query->condition('ws.nid', array(1234567,2345678,3456789,4567890,5678901), 'IN');
  $membership_id = nvidia_og_membership_associate_user_with_program();
  // ... (remainder of the method not shown on the slide)
Node Migration Challenges
● Body images & links with absolute paths
● Empty fields sometimes caused display issues
● Had to deal with “interesting” architecture decisions on the D6 site
● Moved larger files to the cloud
● Reduced the number of content types
Node Migration Code
Dealing with textarea images:
● Needed to use Simple HTML DOM Parser
● Code Review
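The code reviewed on this slide is not part of the transcript. As a rough illustration of the absolute-path problem only, the sketch below rewrites `src` and `href` attributes that point at one host into root-relative paths. It uses a regular expression as a simplified stand-in, not Simple HTML DOM Parser; the function name `example_make_paths_relative` and the host `www.example.com` are hypothetical.

```php
<?php
/**
 * Hypothetical helper: turn absolute src/href URLs on $host into
 * root-relative paths. A regex stand-in for a real DOM parser, so it
 * only handles simple, quoted attributes.
 */
function example_make_paths_relative($html, $host) {
  // Matches src="http(s)://HOST/path" or href='http(s)://HOST/path'.
  $pattern = '~\b(src|href)=(["\'])https?://' . preg_quote($host, '~') . '(/[^"\']*)?\2~i';
  return preg_replace_callback($pattern, function ($m) {
    // A bare host with no path becomes "/".
    $path = (isset($m[3]) && $m[3] !== '') ? $m[3] : '/';
    return $m[1] . '=' . $m[2] . $path . $m[2];
  }, $html);
}
```

In a Migrate class, a helper like this would typically be called from `prepareRow()` on the body field before the row is saved. A DOM parser remains the safer choice for messy markup.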
How a Strange Dev. Decision can Affect a Migration
D6 product page and DB variables table (review) led to the following code:
$variable_name = 'nvidia_product_disable_product_image_' . $row->nid;
// drush_print_r($variable_name);
// Look up the per-node variable in the D6 source database.
$query = Database::getConnection('default', 'd6source')
  ->select('variable', 'v')
  ->fields('v', array('name', 'value'))
  ->condition('v.name', $variable_name, '=')
  ->execute()
  ->fetchAll();
// Guard against nodes that have no such variable row.
$product_image_disabled = !empty($query) ? $query[0]->value : NULL;
// 'i:1;' is the serialized integer 1.
if ($product_image_disabled == 'i:1;') {
  $row->field_inline_image = NULL;
}
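The comparison against `'i:1;'` works because Drupal stores each row of the variable table as a PHP-serialized string, and `i:1;` is the integer 1. If the flag could also have been saved as a boolean or a string, unserializing first is more forgiving. A minimal sketch, with the hypothetical helper name `example_variable_is_enabled`:

```php
<?php
/**
 * Hypothetical helper: TRUE when a serialized variable value is truthy.
 * Accepts the raw string from the variable table, or NULL when no row exists.
 */
function example_variable_is_enabled($serialized) {
  if ($serialized === NULL || $serialized === FALSE) {
    return FALSE;
  }
  // unserialize() returns FALSE on malformed input; @ hides the notice.
  $value = @unserialize($serialized);
  return !empty($value);
}
```

With this helper, `'i:1;'`, `'b:1;'` and `'s:1:"1";'` all count as enabled, while `'i:0;'`, a missing row, and malformed data do not.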
Remove Empty Textarea Fields
public function prepare($entity, stdClass $row) {
  // Null out the matching entity property for every source value that is NULL.
  foreach ($row as $key => $value) {
    if ($value === NULL) {
      $entity->$key = NULL;
    }
  }
}
“Non-Standard” Entity Migrations (Review)
● D2D handles established Drupal entities well
  ○ nodes, users, taxonomy, etc.
● But what if you want to migrate block content to an entity?
  ○ CSV Migration to the rescue
Challenges
● Biggest challenge was reducing the migration time
  ○ The original estimate just for migrating users was over 40h
  ○ Eventually that time was reduced to ~3 hours
  ○ We tweaked my.cnf, php.ini, drush.ini
  ○ Got a really fast server with Intel Xeon processors, fast RAM, and an SSD
Challenges
● Installation of modules in order
  ○ circular dependencies
  ○ features that add fields need to be installed before migration
● Relationships between content
  ○ Both nodes need to exist before creating a relationship
  ○ “Parent” content that did not exist in original site
Migration timeline
● -7 days to release: Content freeze
● -2 days: Automated rebuild, content migration and editorial approval
● -8h: Registration lockdown and migration start
● -2h: Batch processing of content by editors and final tests
Accelerating migration
● Use Drush
● Single pass for each item
  ○ Migration objects are big and slow
  ○ Don’t load an object from DB twice
● Multithreading
  ○ https://www.deeson.co.uk/labs/multi-processing-part-2-how-make-migrate-move
Add multithreading to a working migration class
● Not very portable
  ○ needs a Drush extension
  ○ needs to run on the ‘fast’ server
● Very effective
Add multithreading to a working migration class
● Sub-class the migration
● Make all the sub-migrations use the same index
● Make the sub-migration work on a small ‘chunk’ of the index
● Break the migration in parts and send chunks of it to multiple threads
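The chunking step can be sketched in plain PHP. The function name `example_chunk_ranges` is hypothetical and not part of Migrate or the Drush extension; it only shows how a total row count could be split into the offset/limit pairs that each sub-migration receives. It assumes the source query returns rows in a stable order, otherwise chunks may overlap or skip rows.

```php
<?php
/**
 * Hypothetical helper: split $total rows into offset/limit chunks.
 *
 * @return array
 *   A list of arrays with 'offset' and 'limit' keys, one per chunk.
 */
function example_chunk_ranges($total, $limit) {
  if ($limit <= 0) {
    throw new InvalidArgumentException('Chunk size must be positive.');
  }
  $ranges = array();
  for ($offset = 0; $offset < $total; $offset += $limit) {
    // The last chunk may be shorter than $limit.
    $ranges[] = array(
      'offset' => $offset,
      'limit' => min($limit, $total - $offset),
    );
  }
  return $ranges;
}
```

Each pair would then be handed to one worker process as the `offset` and `limit` arguments of the sub-migration, with all workers writing to the same map table.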
Add multithreading to a working migration class
<?php
class NVMultiThread extends NvidiaUnprivilegedUserMigration {
  public function __construct($args) {
    // This is boilerplate needed by D2D.
    $args += array(
      'source_connection' => NVIDIA_MIGRATE_SOURCE_DATABASE,
      'source_version' => 6,
      'format_mappings' => array(
        '1' => 'filtered_html',
        '2' => 'full_html',
        '3' => 'plain_text',
        '4' => 'full_html',
      ),
      'description' => t('Multithreaded Migration of users from Drupal 6'),
      'role_migration' => 'Role',
    );
Add multithreading to a working migration class
    parent::__construct($args);
    $this->limit = empty($args['limit']) ? 100 : $args['limit'];
    $this->offset = empty($args['offset']) ? 0 : $args['offset'];
    // Map/index table.
    $this->map = new MigrateSQLMap('nvidiaunprivilegeduser',
      // Index definition.
      array(
        'uid' => array(
          'type' => 'int',
          'unsigned' => TRUE,
          'not null' => TRUE,
          'description' => 'User migration reference',
        ),
      ),
      MigrateDestinationUser::getKeySchema()
    );
  }
Add multithreading to a working migration class
  // Modify the original query to limit the number of items to work on.
  protected function query() {
    $query = parent::query();
    $query->range($this->arguments['offset'], $this->arguments['limit']);
    return $query;
  }
}
Measuring the improvement
● Same server
● Restore destination DB from backup after each run
● Same source DB
● Both DBs in the same server
● MySQL optimizations for concurrency issues
Measuring the improvement
1000 rows, 100 per thread
Threads  Time  Speed
1        71s   845/min
2        60s   1000/min
3        54s   1111/min
Measuring the improvement
10,000 rows, 1000 per thread
Threads  Time  Speed
1        707s  848/min
2        303s  1980/min
3        300s  2000/min
4        291s  2061/min
5        351s  1709/min
Measuring the improvement
50,000 rows, 5000 per thread
Threads  Time   Speed
3        1990s  1507/min
4        1562s  1920/min
5        1303s  2302/min
6        1637s  1832/min
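The Speed column appears to be rows processed per minute, rounded down. A small sketch under that assumption, with the hypothetical function name `example_rows_per_minute`, makes it easy to compute the same figure for other runs:

```php
<?php
/**
 * Hypothetical helper: migration throughput in whole rows per minute.
 */
function example_rows_per_minute($rows, $seconds) {
  if ($seconds <= 0) {
    throw new InvalidArgumentException('Elapsed time must be positive.');
  }
  // Scale to 60 seconds and drop the fractional part.
  return (int) floor($rows * 60 / $seconds);
}
```

For example, `example_rows_per_minute(10000, 303)` gives 1980, matching the two-thread row of the 10,000-row table.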
Conclusion
● Drop DNS TTL to 1 minute days before launch
● Repeatability is key
● Migration is very powerful but can be slow
● Automation helps drop downtime close to zero
Conclusion
● Ask for help
● There are many ways to use Migration; if one way is not working, drop it and use it differently
  ○ CSV vs direct read from DB
● Weird things happen with orphaned fields