Make IdsToBigints (mostly!) non-blocking (#5088)
* Make IdsToBigints (mostly!) non-blocking
This pulls in GitLab's MigrationHelpers, which include code to make
column changes in ways that Postgres can do without locking. In general,
this involves creating a new column, adding an index and any foreign
keys as appropriate, adding a trigger to keep it populated alongside
the old column, and then progressively copying data over to the new
column, before removing the old column and replacing it with the new
one.
A few changes to GitLab's MigrationHelpers were necessary:
* Some changes were made to remove dependencies on other GitLab code.
* We explicitly wait for index creation before forging ahead on column
replacements.
* We use different temporary column names, to avoid running into index
name length limits.
* We rename the generated indices back to what they "should" be after
replacing columns.
* We rename the generated foreign keys to use the new column names when
we had to create them. (This allows the migration to be rolled back
without incident.)
# Big Scary Warning
There are two things here that may trip up large instances:
1. The change for tables' "id" columns is not concurrent. In
particular, the stream_entries table may be big, and does not
concurrently migrate its id column. (On the other hand, x_id type
columns are all concurrent.)
2. This migration will take a long time to run, *but it should not
lock tables during that time* (with the exception of the "id"
columns as described above). That means this should probably be run
in `screen` or some other session that can be run for a long time.
Notably, the migration will take *longer* than it would without
these changes, but the website will still be responsive during that
time.
These changes were tested on a relatively large statuses table (256k
entries), and the service remained responsive during the migration.
Migrations both forward and backward were tested.
* Rubocop fixes
* MigrationHelpers: Support ID columns in some cases
This doesn't work in cases where the ID column is referred to as a
foreign key by another table.
* MigrationHelpers: support foreign keys for ID cols
Note that this does not yet support foreign keys on non-primary-key
columns, but Mastodon also doesn't yet have any that we've needed to
migrate.
This means we can perform fully "concurrent" migrations to change ID
column types, and the IdsToBigints migration can happen with effectively
no downtime. (A few operations require a transaction, such as renaming
columns or deleting them, but these transactions should not block for
noticeable amounts of time.)
The algorithm for generating foreign key names has changed with this,
and therefore all of those changed in schema.rb.
* Provide status, allow for interruptions
The MigrationHelpers now allow restarting the rename of a column if it
was interrupted, by removing the old "new column" and re-starting the
process.
Along with this, they now provide status updates on the changes which
are happening, as well as indications about when the changes can be
safely interrupted (when there are at least 10 seconds estimated to be
left before copying data is complete).
The IdsToBigints migration now also sorts the columns it migrates by
size, starting with the largest tables. This should provide
administrators a worst-case scenario estimate for the length of
migrations: each successive change will get faster, giving admins a
chance to abort early on if they need to run the migration later. The
idea is that this does not force them to try to time interruptions
between smaller migrations.
* Fix column sorting in IdsToBigints
Not a significant change, but it impacts the order of columns in the
database and db/schema.rb.
* Actually pause before IdsToBigints
2017-10-02 15:28:59 -04:00
|
|
|
require Rails.root.join('lib', 'mastodon', 'migration_helpers')
|
|
|
|
|
2017-09-22 07:20:04 -04:00
|
|
|
class IdsToBigints < ActiveRecord::Migration[5.1]
|
Make IdsToBigints (mostly!) non-blocking (#5088)
* Make IdsToBigints (mostly!) non-blocking
This pulls in GitLab's MigrationHelpers, which include code to make
column changes in ways that Postgres can do without locking. In general,
this involves creating a new column, adding an index and any foreign
keys as appropriate, adding a trigger to keep it populated alongside
the old column, and then progressively copying data over to the new
column, before removing the old column and replacing it with the new
one.
A few changes to GitLab's MigrationHelpers were necessary:
* Some changes were made to remove dependencies on other GitLab code.
* We explicitly wait for index creation before forging ahead on column
replacements.
* We use different temporary column names, to avoid running into index
name length limits.
* We rename the generated indices back to what they "should" be after
replacing columns.
* We rename the generated foreign keys to use the new column names when
we had to create them. (This allows the migration to be rolled back
without incident.)
# Big Scary Warning
There are two things here that may trip up large instances:
1. The change for tables' "id" columns is not concurrent. In
particular, the stream_entries table may be big, and does not
concurrently migrate its id column. (On the other hand, x_id type
columns are all concurrent.)
2. This migration will take a long time to run, *but it should not
lock tables during that time* (with the exception of the "id"
columns as described above). That means this should probably be run
in `screen` or some other session that can be run for a long time.
Notably, the migration will take *longer* than it would without
these changes, but the website will still be responsive during that
time.
These changes were tested on a relatively large statuses table (256k
entries), and the service remained responsive during the migration.
Migrations both forward and backward were tested.
* Rubocop fixes
* MigrationHelpers: Support ID columns in some cases
This doesn't work in cases where the ID column is referred to as a
foreign key by another table.
* MigrationHelpers: support foreign keys for ID cols
Note that this does not yet support foreign keys on non-primary-key
columns, but Mastodon also doesn't yet have any that we've needed to
migrate.
This means we can perform fully "concurrent" migrations to change ID
column types, and the IdsToBigints migration can happen with effectively
no downtime. (A few operations require a transaction, such as renaming
columns or deleting them, but these transactions should not block for
noticeable amounts of time.)
The algorithm for generating foreign key names has changed with this,
and therefore all of those changed in schema.rb.
* Provide status, allow for interruptions
The MigrationHelpers now allow restarting the rename of a column if it
was interrupted, by removing the old "new column" and re-starting the
process.
Along with this, they now provide status updates on the changes which
are happening, as well as indications about when the changes can be
safely interrupted (when there are at least 10 seconds estimated to be
left before copying data is complete).
The IdsToBigints migration now also sorts the columns it migrates by
size, starting with the largest tables. This should provide
administrators a worst-case scenario estimate for the length of
migrations: each successive change will get faster, giving admins a
chance to abort early on if they need to run the migration later. The
idea is that this does not force them to try to time interruptions
between smaller migrations.
* Fix column sorting in IdsToBigints
Not a significant change, but it impacts the order of columns in the
database and db/schema.rb.
* Actually pause before IdsToBigints
2017-10-02 15:28:59 -04:00
|
|
|
include Mastodon::MigrationHelpers
|
|
|
|
|
|
|
|
disable_ddl_transaction!
|
|
|
|
|
|
|
|
def migrate_columns(to_type)
|
2019-07-23 05:08:11 -04:00
|
|
|
included_columns = [
|
|
|
|
[:account_domain_blocks, :account_id],
|
|
|
|
[:account_domain_blocks, :id],
|
|
|
|
[:accounts, :id],
|
|
|
|
[:blocks, :account_id],
|
|
|
|
[:blocks, :id],
|
|
|
|
[:blocks, :target_account_id],
|
|
|
|
[:conversation_mutes, :account_id],
|
|
|
|
[:conversation_mutes, :id],
|
|
|
|
[:domain_blocks, :id],
|
|
|
|
[:favourites, :account_id],
|
|
|
|
[:favourites, :id],
|
|
|
|
[:favourites, :status_id],
|
|
|
|
[:follow_requests, :account_id],
|
|
|
|
[:follow_requests, :id],
|
|
|
|
[:follow_requests, :target_account_id],
|
|
|
|
[:follows, :account_id],
|
|
|
|
[:follows, :id],
|
|
|
|
[:follows, :target_account_id],
|
|
|
|
[:imports, :account_id],
|
|
|
|
[:imports, :id],
|
|
|
|
[:media_attachments, :account_id],
|
|
|
|
[:media_attachments, :id],
|
|
|
|
[:mentions, :account_id],
|
|
|
|
[:mentions, :id],
|
|
|
|
[:mutes, :account_id],
|
|
|
|
[:mutes, :id],
|
|
|
|
[:mutes, :target_account_id],
|
|
|
|
[:notifications, :account_id],
|
|
|
|
[:notifications, :from_account_id],
|
|
|
|
[:notifications, :id],
|
|
|
|
[:oauth_access_grants, :application_id],
|
|
|
|
[:oauth_access_grants, :id],
|
|
|
|
[:oauth_access_grants, :resource_owner_id],
|
|
|
|
[:oauth_access_tokens, :application_id],
|
|
|
|
[:oauth_access_tokens, :id],
|
|
|
|
[:oauth_access_tokens, :resource_owner_id],
|
|
|
|
[:oauth_applications, :id],
|
|
|
|
[:oauth_applications, :owner_id],
|
|
|
|
[:reports, :account_id],
|
|
|
|
[:reports, :action_taken_by_account_id],
|
|
|
|
[:reports, :id],
|
|
|
|
[:reports, :target_account_id],
|
|
|
|
[:session_activations, :access_token_id],
|
|
|
|
[:session_activations, :user_id],
|
|
|
|
[:session_activations, :web_push_subscription_id],
|
|
|
|
[:settings, :id],
|
|
|
|
[:settings, :thing_id],
|
|
|
|
[:statuses, :account_id],
|
|
|
|
[:statuses, :application_id],
|
|
|
|
[:statuses, :in_reply_to_account_id],
|
|
|
|
[:stream_entries, :account_id],
|
|
|
|
[:stream_entries, :id],
|
|
|
|
[:subscriptions, :account_id],
|
|
|
|
[:subscriptions, :id],
|
|
|
|
[:tags, :id],
|
|
|
|
[:users, :account_id],
|
|
|
|
[:users, :id],
|
|
|
|
[:web_settings, :id],
|
|
|
|
[:web_settings, :user_id],
|
|
|
|
]
|
|
|
|
included_columns << [:deprecated_preview_cards, :id] if table_exists?(:deprecated_preview_cards)
|
|
|
|
|
Make IdsToBigints (mostly!) non-blocking (#5088)
* Make IdsToBigints (mostly!) non-blocking
This pulls in GitLab's MigrationHelpers, which include code to make
column changes in ways that Postgres can do without locking. In general,
this involves creating a new column, adding an index and any foreign
keys as appropriate, adding a trigger to keep it populated alongside
the old column, and then progressively copying data over to the new
column, before removing the old column and replacing it with the new
one.
A few changes to GitLab's MigrationHelpers were necessary:
* Some changes were made to remove dependencies on other GitLab code.
* We explicitly wait for index creation before forging ahead on column
replacements.
* We use different temporary column names, to avoid running into index
name length limits.
* We rename the generated indices back to what they "should" be after
replacing columns.
* We rename the generated foreign keys to use the new column names when
we had to create them. (This allows the migration to be rolled back
without incident.)
# Big Scary Warning
There are two things here that may trip up large instances:
1. The change for tables' "id" columns is not concurrent. In
particular, the stream_entries table may be big, and does not
concurrently migrate its id column. (On the other hand, x_id type
columns are all concurrent.)
2. This migration will take a long time to run, *but it should not
lock tables during that time* (with the exception of the "id"
columns as described above). That means this should probably be run
in `screen` or some other session that can be run for a long time.
Notably, the migration will take *longer* than it would without
these changes, but the website will still be responsive during that
time.
These changes were tested on a relatively large statuses table (256k
entries), and the service remained responsive during the migration.
Migrations both forward and backward were tested.
* Rubocop fixes
* MigrationHelpers: Support ID columns in some cases
This doesn't work in cases where the ID column is referred to as a
foreign key by another table.
* MigrationHelpers: support foreign keys for ID cols
Note that this does not yet support foreign keys on non-primary-key
columns, but Mastodon also doesn't yet have any that we've needed to
migrate.
This means we can perform fully "concurrent" migrations to change ID
column types, and the IdsToBigints migration can happen with effectively
no downtime. (A few operations require a transaction, such as renaming
columns or deleting them, but these transactions should not block for
noticeable amounts of time.)
The algorithm for generating foreign key names has changed with this,
and therefore all of those changed in schema.rb.
* Provide status, allow for interruptions
The MigrationHelpers now allow restarting the rename of a column if it
was interrupted, by removing the old "new column" and re-starting the
process.
Along with this, they now provide status updates on the changes which
are happening, as well as indications about when the changes can be
safely interrupted (when there are at least 10 seconds estimated to be
left before copying data is complete).
The IdsToBigints migration now also sorts the columns it migrates by
size, starting with the largest tables. This should provide
administrators a worst-case scenario estimate for the length of
migrations: each successive change will get faster, giving admins a
chance to abort early on if they need to run the migration later. The
idea is that this does not force them to try to time interruptions
between smaller migrations.
* Fix column sorting in IdsToBigints
Not a significant change, but it impacts the order of columns in the
database and db/schema.rb.
* Actually pause before IdsToBigints
2017-10-02 15:28:59 -04:00
|
|
|
# Print out a warning that this will probably take a while.
|
2020-01-27 05:04:42 -05:00
|
|
|
if $stdout.isatty
|
|
|
|
say ''
|
|
|
|
say 'WARNING: This migration may take a *long* time for large instances'
|
|
|
|
say 'It will *not* lock tables for any significant time, but it may run'
|
|
|
|
say 'for a very long time. We will pause for 10 seconds to allow you to'
|
|
|
|
say 'interrupt this migration if you are not ready.'
|
|
|
|
say ''
|
|
|
|
say 'This migration has some sections that can be safely interrupted'
|
|
|
|
say 'and restarted later, and will tell you when those are occurring.'
|
|
|
|
say ''
|
2022-10-09 18:33:38 -04:00
|
|
|
say 'For more information, see https://github.com/mastodon/mastodon/pull/5088'
|
Make IdsToBigints (mostly!) non-blocking (#5088)
* Make IdsToBigints (mostly!) non-blocking
This pulls in GitLab's MigrationHelpers, which include code to make
column changes in ways that Postgres can do without locking. In general,
this involves creating a new column, adding an index and any foreign
keys as appropriate, adding a trigger to keep it populated alongside
the old column, and then progressively copying data over to the new
column, before removing the old column and replacing it with the new
one.
A few changes to GitLab's MigrationHelpers were necessary:
* Some changes were made to remove dependencies on other GitLab code.
* We explicitly wait for index creation before forging ahead on column
replacements.
* We use different temporary column names, to avoid running into index
name length limits.
* We rename the generated indices back to what they "should" be after
replacing columns.
* We rename the generated foreign keys to use the new column names when
we had to create them. (This allows the migration to be rolled back
without incident.)
# Big Scary Warning
There are two things here that may trip up large instances:
1. The change for tables' "id" columns is not concurrent. In
particular, the stream_entries table may be big, and does not
concurrently migrate its id column. (On the other hand, x_id type
columns are all concurrent.)
2. This migration will take a long time to run, *but it should not
lock tables during that time* (with the exception of the "id"
columns as described above). That means this should probably be run
in `screen` or some other session that can be run for a long time.
Notably, the migration will take *longer* than it would without
these changes, but the website will still be responsive during that
time.
These changes were tested on a relatively large statuses table (256k
entries), and the service remained responsive during the migration.
Migrations both forward and backward were tested.
* Rubocop fixes
* MigrationHelpers: Support ID columns in some cases
This doesn't work in cases where the ID column is referred to as a
foreign key by another table.
* MigrationHelpers: support foreign keys for ID cols
Note that this does not yet support foreign keys on non-primary-key
columns, but Mastodon also doesn't yet have any that we've needed to
migrate.
This means we can perform fully "concurrent" migrations to change ID
column types, and the IdsToBigints migration can happen with effectively
no downtime. (A few operations require a transaction, such as renaming
columns or deleting them, but these transactions should not block for
noticeable amounts of time.)
The algorithm for generating foreign key names has changed with this,
and therefore all of those changed in schema.rb.
* Provide status, allow for interruptions
The MigrationHelpers now allow restarting the rename of a column if it
was interrupted, by removing the old "new column" and re-starting the
process.
Along with this, they now provide status updates on the changes which
are happening, as well as indications about when the changes can be
safely interrupted (when there are at least 10 seconds estimated to be
left before copying data is complete).
The IdsToBigints migration now also sorts the columns it migrates by
size, starting with the largest tables. This should provide
administrators a worst-case scenario estimate for the length of
migrations: each successive change will get faster, giving admins a
chance to abort early on if they need to run the migration later. The
idea is that this does not force them to try to time interruptions
between smaller migrations.
* Fix column sorting in IdsToBigints
Not a significant change, but it impacts the order of columns in the
database and db/schema.rb.
* Actually pause before IdsToBigints
2017-10-02 15:28:59 -04:00
|
|
|
|
2020-01-27 05:04:42 -05:00
|
|
|
10.downto(1) do |i|
|
|
|
|
say "Continuing in #{i} second#{i == 1 ? '' : 's'}...", true
|
|
|
|
sleep 1
|
|
|
|
end
|
Make IdsToBigints (mostly!) non-blocking (#5088)
* Make IdsToBigints (mostly!) non-blocking
This pulls in GitLab's MigrationHelpers, which include code to make
column changes in ways that Postgres can do without locking. In general,
this involves creating a new column, adding an index and any foreign
keys as appropriate, adding a trigger to keep it populated alongside
the old column, and then progressively copying data over to the new
column, before removing the old column and replacing it with the new
one.
A few changes to GitLab's MigrationHelpers were necessary:
* Some changes were made to remove dependencies on other GitLab code.
* We explicitly wait for index creation before forging ahead on column
replacements.
* We use different temporary column names, to avoid running into index
name length limits.
* We rename the generated indices back to what they "should" be after
replacing columns.
* We rename the generated foreign keys to use the new column names when
we had to create them. (This allows the migration to be rolled back
without incident.)
# Big Scary Warning
There are two things here that may trip up large instances:
1. The change for tables' "id" columns is not concurrent. In
particular, the stream_entries table may be big, and does not
concurrently migrate its id column. (On the other hand, x_id type
columns are all concurrent.)
2. This migration will take a long time to run, *but it should not
lock tables during that time* (with the exception of the "id"
columns as described above). That means this should probably be run
in `screen` or some other session that can be run for a long time.
Notably, the migration will take *longer* than it would without
these changes, but the website will still be responsive during that
time.
These changes were tested on a relatively large statuses table (256k
entries), and the service remained responsive during the migration.
Migrations both forward and backward were tested.
* Rubocop fixes
* MigrationHelpers: Support ID columns in some cases
This doesn't work in cases where the ID column is referred to as a
foreign key by another table.
* MigrationHelpers: support foreign keys for ID cols
Note that this does not yet support foreign keys on non-primary-key
columns, but Mastodon also doesn't yet have any that we've needed to
migrate.
This means we can perform fully "concurrent" migrations to change ID
column types, and the IdsToBigints migration can happen with effectively
no downtime. (A few operations require a transaction, such as renaming
columns or deleting them, but these transactions should not block for
noticeable amounts of time.)
The algorithm for generating foreign key names has changed with this,
and therefore all of those changed in schema.rb.
* Provide status, allow for interruptions
The MigrationHelpers now allow restarting the rename of a column if it
was interrupted, by removing the old "new column" and re-starting the
process.
Along with this, they now provide status updates on the changes which
are happening, as well as indications about when the changes can be
safely interrupted (when there are at least 10 seconds estimated to be
left before copying data is complete).
The IdsToBigints migration now also sorts the columns it migrates by
size, starting with the largest tables. This should provide
administrators a worst-case scenario estimate for the length of
migrations: each successive change will get faster, giving admins a
chance to abort early on if they need to run the migration later. The
idea is that this does not force them to try to time interruptions
between smaller migrations.
* Fix column sorting in IdsToBigints
Not a significant change, but it impacts the order of columns in the
database and db/schema.rb.
* Actually pause before IdsToBigints
2017-10-02 15:28:59 -04:00
|
|
|
end
|
|
|
|
|
2019-07-23 05:08:11 -04:00
|
|
|
tables = included_columns.map(&:first).uniq
|
Make IdsToBigints (mostly!) non-blocking (#5088)
* Make IdsToBigints (mostly!) non-blocking
This pulls in GitLab's MigrationHelpers, which include code to make
column changes in ways that Postgres can do without locking. In general,
this involves creating a new column, adding an index and any foreign
keys as appropriate, adding a trigger to keep it populated alongside
the old column, and then progressively copying data over to the new
column, before removing the old column and replacing it with the new
one.
A few changes to GitLab's MigrationHelpers were necessary:
* Some changes were made to remove dependencies on other GitLab code.
* We explicitly wait for index creation before forging ahead on column
replacements.
* We use different temporary column names, to avoid running into index
name length limits.
* We rename the generated indices back to what they "should" be after
replacing columns.
* We rename the generated foreign keys to use the new column names when
we had to create them. (This allows the migration to be rolled back
without incident.)
# Big Scary Warning
There are two things here that may trip up large instances:
1. The change for tables' "id" columns is not concurrent. In
particular, the stream_entries table may be big, and does not
concurrently migrate its id column. (On the other hand, x_id type
columns are all concurrent.)
2. This migration will take a long time to run, *but it should not
lock tables during that time* (with the exception of the "id"
columns as described above). That means this should probably be run
in `screen` or some other session that can be run for a long time.
Notably, the migration will take *longer* than it would without
these changes, but the website will still be responsive during that
time.
These changes were tested on a relatively large statuses table (256k
entries), and the service remained responsive during the migration.
Migrations both forward and backward were tested.
* Rubocop fixes
* MigrationHelpers: Support ID columns in some cases
This doesn't work in cases where the ID column is referred to as a
foreign key by another table.
* MigrationHelpers: support foreign keys for ID cols
Note that this does not yet support foreign keys on non-primary-key
columns, but Mastodon also doesn't yet have any that we've needed to
migrate.
This means we can perform fully "concurrent" migrations to change ID
column types, and the IdsToBigints migration can happen with effectively
no downtime. (A few operations require a transaction, such as renaming
columns or deleting them, but these transactions should not block for
noticeable amounts of time.)
The algorithm for generating foreign key names has changed with this,
and therefore all of those changed in schema.rb.
* Provide status, allow for interruptions
The MigrationHelpers now allow restarting the rename of a column if it
was interrupted, by removing the old "new column" and re-starting the
process.
Along with this, they now provide status updates on the changes which
are happening, as well as indications about when the changes can be
safely interrupted (when there are at least 10 seconds estimated to be
left before copying data is complete).
The IdsToBigints migration now also sorts the columns it migrates by
size, starting with the largest tables. This should provide
administrators a worst-case scenario estimate for the length of
migrations: each successive change will get faster, giving admins a
chance to abort early on if they need to run the migration later. The
idea is that this does not force them to try to time interruptions
between smaller migrations.
* Fix column sorting in IdsToBigints
Not a significant change, but it impacts the order of columns in the
database and db/schema.rb.
* Actually pause before IdsToBigints
2017-10-02 15:28:59 -04:00
|
|
|
table_sizes = {}
|
|
|
|
|
|
|
|
# Sort tables by their size
|
|
|
|
tables.each do |table|
|
|
|
|
table_sizes[table] = estimate_rows_in_table(table)
|
|
|
|
end
|
|
|
|
|
2019-07-23 05:08:11 -04:00
|
|
|
ordered_columns = included_columns.sort_by do |col_parts|
|
Make IdsToBigints (mostly!) non-blocking (#5088)
* Make IdsToBigints (mostly!) non-blocking
This pulls in GitLab's MigrationHelpers, which include code to make
column changes in ways that Postgres can do without locking. In general,
this involves creating a new column, adding an index and any foreign
keys as appropriate, adding a trigger to keep it populated alongside
the old column, and then progressively copying data over to the new
column, before removing the old column and replacing it with the new
one.
A few changes to GitLab's MigrationHelpers were necessary:
* Some changes were made to remove dependencies on other GitLab code.
* We explicitly wait for index creation before forging ahead on column
replacements.
* We use different temporary column names, to avoid running into index
name length limits.
* We rename the generated indices back to what they "should" be after
replacing columns.
* We rename the generated foreign keys to use the new column names when
we had to create them. (This allows the migration to be rolled back
without incident.)
# Big Scary Warning
There are two things here that may trip up large instances:
1. The change for tables' "id" columns is not concurrent. In
particular, the stream_entries table may be big, and does not
concurrently migrate its id column. (On the other hand, x_id type
columns are all concurrent.)
2. This migration will take a long time to run, *but it should not
lock tables during that time* (with the exception of the "id"
columns as described above). That means this should probably be run
in `screen` or some other session that can be run for a long time.
Notably, the migration will take *longer* than it would without
these changes, but the website will still be responsive during that
time.
These changes were tested on a relatively large statuses table (256k
entries), and the service remained responsive during the migration.
Migrations both forward and backward were tested.
* Rubocop fixes
* MigrationHelpers: Support ID columns in some cases
This doesn't work in cases where the ID column is referred to as a
foreign key by another table.
* MigrationHelpers: support foreign keys for ID cols
Note that this does not yet support foreign keys on non-primary-key
columns, but Mastodon also doesn't yet have any that we've needed to
migrate.
This means we can perform fully "concurrent" migrations to change ID
column types, and the IdsToBigints migration can happen with effectively
no downtime. (A few operations require a transaction, such as renaming
columns or deleting them, but these transactions should not block for
noticeable amounts of time.)
The algorithm for generating foreign key names has changed with this,
and therefore all of those changed in schema.rb.
* Provide status, allow for interruptions
The MigrationHelpers now allow restarting the rename of a column if it
was interrupted, by removing the old "new column" and re-starting the
process.
Along with this, they now provide status updates on the changes which
are happening, as well as indications about when the changes can be
safely interrupted (when there are at least 10 seconds estimated to be
left before copying data is complete).
The IdsToBigints migration now also sorts the columns it migrates by
size, starting with the largest tables. This should provide
administrators a worst-case scenario estimate for the length of
migrations: each successive change will get faster, giving admins a
chance to abort early on if they need to run the migration later. The
idea is that this does not force them to try to time interruptions
between smaller migrations.
* Fix column sorting in IdsToBigints
Not a significant change, but it impacts the order of columns in the
database and db/schema.rb.
* Actually pause before IdsToBigints
2017-10-02 15:28:59 -04:00
|
|
|
[-table_sizes[col_parts.first], col_parts.last]
|
|
|
|
end
|
|
|
|
|
|
|
|
ordered_columns.each do |column_parts|
|
|
|
|
table, column = column_parts
|
|
|
|
|
|
|
|
# Skip this if we're resuming and already did this one.
|
|
|
|
next if column_for(table, column).sql_type == to_type.to_s
|
|
|
|
|
|
|
|
change_column_type_concurrently table, column, to_type
|
|
|
|
cleanup_concurrent_column_type_change table, column
|
|
|
|
end
|
|
|
|
end
|
|
|
|
|
2017-09-22 07:20:04 -04:00
|
|
|
def up
|
Make IdsToBigints (mostly!) non-blocking (#5088)
* Make IdsToBigints (mostly!) non-blocking
This pulls in GitLab's MigrationHelpers, which include code to make
column changes in ways that Postgres can do without locking. In general,
this involves creating a new column, adding an index and any foreign
keys as appropriate, adding a trigger to keep it populated alongside
the old column, and then progressively copying data over to the new
column, before removing the old column and replacing it with the new
one.
A few changes to GitLab's MigrationHelpers were necessary:
* Some changes were made to remove dependencies on other GitLab code.
* We explicitly wait for index creation before forging ahead on column
replacements.
* We use different temporary column names, to avoid running into index
name length limits.
* We rename the generated indices back to what they "should" be after
replacing columns.
* We rename the generated foreign keys to use the new column names when
we had to create them. (This allows the migration to be rolled back
without incident.)
# Big Scary Warning
There are two things here that may trip up large instances:
1. The change for tables' "id" columns is not concurrent. In
particular, the stream_entries table may be big, and does not
concurrently migrate its id column. (On the other hand, x_id type
columns are all concurrent.)
2. This migration will take a long time to run, *but it should not
lock tables during that time* (with the exception of the "id"
columns as described above). That means this should probably be run
in `screen` or some other session that can be run for a long time.
Notably, the migration will take *longer* than it would without
these changes, but the website will still be responsive during that
time.
These changes were tested on a relatively large statuses table (256k
entries), and the service remained responsive during the migration.
Migrations both forward and backward were tested.
* Rubocop fixes
* MigrationHelpers: Support ID columns in some cases
This doesn't work in cases where the ID column is referred to as a
foreign key by another table.
* MigrationHelpers: support foreign keys for ID cols
Note that this does not yet support foreign keys on non-primary-key
columns, but Mastodon also doesn't yet have any that we've needed to
migrate.
This means we can perform fully "concurrent" migrations to change ID
column types, and the IdsToBigints migration can happen with effectively
no downtime. (A few operations require a transaction, such as renaming
columns or deleting them, but these transactions should not block for
noticeable amounts of time.)
The algorithm for generating foreign key names has changed with this,
and therefore all of those changed in schema.rb.
* Provide status, allow for interruptions
The MigrationHelpers now allow restarting the rename of a column if it
was interrupted, by removing the old "new column" and re-starting the
process.
Along with this, they now provide status updates on the changes which
are happening, as well as indications about when the changes can be
safely interrupted (when there are at least 10 seconds estimated to be
left before copying data is complete).
The IdsToBigints migration now also sorts the columns it migrates by
size, starting with the largest tables. This should provide
administrators a worst-case scenario estimate for the length of
migrations: each successive change will get faster, giving admins a
chance to abort early on if they need to run the migration later. The
idea is that this does not force them to try to time interruptions
between smaller migrations.
* Fix column sorting in IdsToBigints
Not a significant change, but it impacts the order of columns in the
database and db/schema.rb.
* Actually pause before IdsToBigints
2017-10-02 15:28:59 -04:00
|
|
|
migrate_columns(:bigint)
|
2017-09-22 07:20:04 -04:00
|
|
|
end
|
|
|
|
|
|
|
|
def down
|
Make IdsToBigints (mostly!) non-blocking (#5088)
* Make IdsToBigints (mostly!) non-blocking
This pulls in GitLab's MigrationHelpers, which include code to make
column changes in ways that Postgres can do without locking. In general,
this involves creating a new column, adding an index and any foreign
keys as appropriate, adding a trigger to keep it populated alongside
the old column, and then progressively copying data over to the new
column, before removing the old column and replacing it with the new
one.
A few changes to GitLab's MigrationHelpers were necessary:
* Some changes were made to remove dependencies on other GitLab code.
* We explicitly wait for index creation before forging ahead on column
replacements.
* We use different temporary column names, to avoid running into index
name length limits.
* We rename the generated indices back to what they "should" be after
replacing columns.
* We rename the generated foreign keys to use the new column names when
we had to create them. (This allows the migration to be rolled back
without incident.)
# Big Scary Warning
There are two things here that may trip up large instances:
1. The change for tables' "id" columns is not concurrent. In
particular, the stream_entries table may be big, and does not
concurrently migrate its id column. (On the other hand, x_id type
columns are all concurrent.)
2. This migration will take a long time to run, *but it should not
lock tables during that time* (with the exception of the "id"
columns as described above). That means this should probably be run
in `screen` or some other session that can be run for a long time.
Notably, the migration will take *longer* than it would without
these changes, but the website will still be responsive during that
time.
These changes were tested on a relatively large statuses table (256k
entries), and the service remained responsive during the migration.
Migrations both forward and backward were tested.
* Rubocop fixes
* MigrationHelpers: Support ID columns in some cases
This doesn't work in cases where the ID column is referred to as a
foreign key by another table.
* MigrationHelpers: support foreign keys for ID cols
Note that this does not yet support foreign keys on non-primary-key
columns, but Mastodon also doesn't yet have any that we've needed to
migrate.
This means we can perform fully "concurrent" migrations to change ID
column types, and the IdsToBigints migration can happen with effectively
no downtime. (A few operations require a transaction, such as renaming
columns or deleting them, but these transactions should not block for
noticeable amounts of time.)
The algorithm for generating foreign key names has changed with this,
and therefore all of those changed in schema.rb.
* Provide status, allow for interruptions
The MigrationHelpers now allow restarting the rename of a column if it
was interrupted, by removing the old "new column" and re-starting the
process.
Along with this, they now provide status updates on the changes which
are happening, as well as indications about when the changes can be
safely interrupted (when there are at least 10 seconds estimated to be
left before copying data is complete).
The IdsToBigints migration now also sorts the columns it migrates by
size, starting with the largest tables. This should provide
administrators a worst-case scenario estimate for the length of
migrations: each successive change will get faster, giving admins a
chance to abort early on if they need to run the migration later. The
idea is that this does not force them to try to time interruptions
between smaller migrations.
* Fix column sorting in IdsToBigints
Not a significant change, but it impacts the order of columns in the
database and db/schema.rb.
* Actually pause before IdsToBigints
2017-10-02 15:28:59 -04:00
|
|
|
migrate_columns(:integer)
|
2017-09-22 07:20:04 -04:00
|
|
|
end
|
|
|
|
end
|