Feedback requested: gh-ost on Galera / XtraDB cluster? #224

shlomi-noach opened this issue Sep 2, 2016 · 20 comments

@shlomi-noach
Contributor

Dear community,
At this time we are not using Galera/XtraDB Cluster. We haven't tested gh-ost on Galera. We're curious to hear:

  • If anyone has tried it
  • If there are specific blockers or problems
  • If gh-ost on Galera makes sense, given you can use Rolling Schema Upgrades.

Thank you!

@Roguelazer
Contributor

That's actually on my list to test next week. It seems like the easiest way to do it would be to hang an asynchronous replica off of the cluster (which is a supported operation) and run gh-ost there to minimize load on the synchronous cluster.

gh-ost definitely makes sense; RSU is a huge pain in the butt since you are de-syncing the node (and it needs to stay within gcache to avoid undoing the migration and forcing an SST).
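
For comparison, the RSU flow being referred to here is roughly the following, run one node at a time (a sketch only; wsrep_OSU_method is the standard Galera setting, and the database/table/column names are placeholders):

SET SESSION wsrep_OSU_method = 'RSU';   -- this node de-syncs from the cluster for the duration of the ALTER
ALTER TABLE mydb.mytable ADD COLUMN example_col INT NOT NULL DEFAULT 0;   -- applied locally, not replicated
SET SESSION wsrep_OSU_method = 'TOI';   -- back to normal replicated DDL
-- Repeat on every node. Each node then has to catch up via IST while its missed
-- writesets are still in the donor's gcache, otherwise it falls back to a full SST.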

I'll let you know how the testing in our dev cluster works.

@SuperQ
Contributor

SuperQ commented Sep 5, 2016

We use Galera/XtraDB and would definitely like to use gh-ost. We also have async slaves replicating from Galera "master" clusters to provide additional read capacity and analytics access. Therefore it does make sense to continue to use an online schema tool.

@shlomi-noach
Contributor Author

Thank you both! Do you foresee a particular issue with the final lock/rename steps?

@SuperQ
Contributor

SuperQ commented Sep 5, 2016

This applies to all writes that gh-ost does, not just the final step.

One thing that happens with Galera/XtraDB is the potential to deadlock "frequently". In theory this can happen between threads on a normal MySQL master, but it's very rare.

This is caused by the rollup of multiple transactions into a single locking operation between the multiple Galera masters. The default for this is 16 transactions, but it is tunable. I personally don't recommend changing this, but some people do.

gh-ost needs to be able to receive these deadlock-rejected queries and retry them. This is safe, but slightly tedious. When dealing with distributed multi-master systems, this kind of problem will become more common. :)

@shlomi-noach
Contributor Author

@SuperQ gh-ost already has retries, which are also configurable via -default-retries

@SuperQ
Contributor

SuperQ commented Sep 5, 2016

👍 💯 🎱

@Frank-Leemkuil

Frank-Leemkuil commented Oct 18, 2016

@shlomi-noach We are testing gh-ost on a Galera staging cluster and get a lot of deadlocks; it looks like it never stops:

./gh-ost \
--max-load=Threads_running=25 \
--critical-load=Threads_running=1000 \
--chunk-size=1000 \
--alter="MODIFY COLUMN EventId bigint SIGNED NOT NULL" \
--max-lag-millis=1500 \
--user="username" \
--password="password" \
--database="databasename" \
--table="alarm" \
--host="ip-number" \
--assume-master-host="ip-number" \
--discard-foreign-keys \
--allow-on-master \
--allow-master-master \
--switch-to-rbr \
--cut-over="default" \
--exact-rowcount \
--concurrent-rowcount \
--default-retries=120 \
--panic-flag-file="/tmp/ghost.panic.flag" \
--postpone-cut-over-flag-file="/tmp/ghost.postpone.flag" \

Migrating db.alarm; Ghost table is db._alarm_gho
Migrating hostname:3306; inspecting hostname:3306; executing on hostname
Migration started at Tue Oct 18 13:40:26 +0200 2016
chunk-size: 1000; max-lag-millis: 1500ms; max-load: Threads_running=25; critical-load: Threads_running=1000; nice-ratio: 0.000000
throttle-additional-flag-file: /tmp/gh-ost.throttle
postpone-cut-over-flag-file: /tmp/ghost.postpone.flag
panic-flag-file: /tmp/ghost.panic.flag
Serving on unix socket: /tmp/gh-ost.db.alarm.sock
Copy: 102911/102911 100.0%; Applied: 0; Backlog: 0/100; Time: 2m14s(total), 13s(copy); streamer: mysql-bin.000001:47131128; State: migrating; ETA: due
2016-10-18 13:42:41 ERROR Error 1050: Table '_alarm_del' already exists
2016-10-18 13:42:41 ERROR Error 1213: Deadlock found when trying to get lock; try restarting transaction
Copy: 102911/102911 100.0%; Applied: 0; Backlog: 0/100; Time: 2m15s(total), 13s(copy); streamer: mysql-bin.000001:47131834; State: migrating; ETA: due
2016-10-18 13:42:42 ERROR Session lock gh-ost.506.lock expected to be found but wasn't
Migrating db.alarm; Ghost table is db._alarm_gho
Migrating hostname:3306; inspecting hostname:3306; executing on hostname
Migration started at Tue Oct 18 13:40:26 +0200 2016
chunk-size: 1000; max-lag-millis: 1500ms; max-load: Threads_running=25; critical-load: Threads_running=1000; nice-ratio: 0.000000
throttle-additional-flag-file: /tmp/gh-ost.throttle
postpone-cut-over-flag-file: /tmp/ghost.postpone.flag
panic-flag-file: /tmp/ghost.panic.flag
Serving on unix socket: /tmp/gh-ost.db.alarm.sock
Copy: 102911/102911 100.0%; Applied: 0; Backlog: 0/100; Time: 2m17s(total), 13s(copy); streamer: mysql-bin.000001:47134853; State: migrating; ETA: due
2016-10-18 13:42:44 ERROR Error 1213: Deadlock found when trying to get lock; try restarting transaction
2016-10-18 13:42:44 ERROR Error 1050: Table '_alarm_del' already exists
2016-10-18 13:42:44 ERROR Error 1050: Table '_alarm_del' already exists

The staging environment is not a slave of production and does not have to deal with live data.

@shlomi-noach
Contributor Author

@Frank-Leemkuil a few questions:

  • as per the Table '_alarm_del' already exists errors -- did that table pre-exist before the migration?
  • Session lock gh-ost.506.lock expected to be found but wasn't baffles me at this time.
  • What do you mean by "it looks like it never stops"? Please clarify.
  • What happens after given log output?
  • What happens before Copy: 102911/102911 100.0% ?

@Frank-Leemkuil

  • The tables did not pre-exist before the migration
  • The deadlock is retried every time
  • I dropped the session because there was no progress.
  • When I test the command without --execute it works fine.

@shlomi-noach
Contributor Author

  • "The deadlock is retried every time" -- so this repeats again and again, up to --default-retries (default 60), and then it panics and exits?
  • "I dropped the session because there was no progress." -- I'm sorry, I do not follow. Can you please tell the full story and explain this? I don't have your screen in front of me.
  • "When I test the command without --execute it works fine." -- makes sense, since your problem is in the cut-over phase.

@Roguelazer
Contributor

Since named locks and FLUSH TABLE x WITH READ LOCK don't work in Galera, it seems reasonable that you got errors?

@shlomi-noach
Contributor Author

shlomi-noach commented Oct 18, 2016

Neither works in Galera? Does LOCK TABLES sometable WRITE work in Galera?

@Roguelazer
Contributor

Roguelazer commented Oct 18, 2016

Let me clarify (and apologize for not posting this last week):

  • GET_LOCK() works, but is not replicated. It's possible to get the same named lock from multiple hosts.
  • FLUSH TABLE ... WITH READ LOCK and LOCK TABLES ... WRITE both also work, but are not replicated.

When you issue FLUSH TABLE ... WITH READ LOCK it actually entirely pauses the wsrep provider (which implements replication in Galera). This can lead to some bizarre behaviors. For example, given a database containing two empty tables (bar and baz) with only auto-incrementing primary key columns:

  • (on node 1): FLUSH TABLE bar WITH READ LOCK
  • (on node 1): INSERT INTO bar VALUES() -> returns ERROR 1099 (HY000): Table 'bar' was locked with a READ lock and can't be updated
  • (on node 1): INSERT INTO baz VALUES() -> returns ERROR 1100 (HY000): Table 'baz' was not locked with LOCK TABLES
  • (on node 1): RENAME TABLE baz TO bing -> returns ERROR 1223 (HY000): Can't execute the query because you have a conflicting read lock

Write locks are similarly confusing:

(all on node 1)

  • LOCK TABLE bar WRITE
  • RENAME TABLE baz TO bing -> session hangs forever and must be killed
  • RENAME TABLE bar TO bing -> session hangs forever and must be killed

After killing a session that had a lock, running UNLOCK TABLES seems to leave the server in a weird state where the table isn't locked, but it can't be written to (directly or by replication) and it can't be locked again (either by FLUSH TABLE ... or by LOCK TABLES ...). The only way to recover from this state (that I've found so far) is to kill -9 mysqld.

There may be some way to safely do this with Galera, but I have not yet found it.

All tests were executed on Percona XtraDB Cluster 5.6.30.
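
For anyone who wants to reproduce the behavior above, here is a condensed SQL version of the same test, assuming two freshly created empty tables bar and baz on node 1 (the CREATE statements are illustrative, not part of the original test):

-- Setup on node 1: two empty tables with only an auto-increment primary key
CREATE TABLE bar (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY);
CREATE TABLE baz (id INT NOT NULL AUTO_INCREMENT PRIMARY KEY);

-- Read-lock test, all on node 1
FLUSH TABLE bar WITH READ LOCK;
INSERT INTO bar VALUES ();   -- ERROR 1099: Table 'bar' was locked with a READ lock and can't be updated
INSERT INTO baz VALUES ();   -- ERROR 1100: Table 'baz' was not locked with LOCK TABLES
RENAME TABLE baz TO bing;    -- ERROR 1223: Can't execute the query because you have a conflicting read lock
UNLOCK TABLES;

-- Write-lock test, all on node 1
LOCK TABLE bar WRITE;
RENAME TABLE baz TO bing;    -- hangs forever; the session must be killed
UNLOCK TABLES;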

@shlomi-noach
Contributor Author

@Roguelazer thank you so much for elaborating.

On gh-ost's part, it does not expect advisory locks or table locks to be replicated; it expects to set and see the lock on one particular server. Having said that, a multi-write Galera setup would be a problem, as other writable masters will not honor the lock.
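
To make the multi-writer concern concrete, here is a minimal sketch (the lock name is hypothetical) of how a named lock behaves per node rather than cluster-wide, per @Roguelazer's observation that GET_LOCK() is not replicated:

-- On Galera node 1
SELECT GET_LOCK('gh-ost.example.lock', 0);    -- returns 1: lock acquired on node 1
-- On Galera node 2 (another writable node)
SELECT GET_LOCK('gh-ost.example.lock', 0);    -- also returns 1: the lock is not replicated,
                                              -- so node 2 "acquires" the very same lock
-- Each node only sees its own lock holder
SELECT IS_USED_LOCK('gh-ost.example.lock');   -- local connection id, or NULL where nobody holds it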

Your comment also presents a bigger problem: a lock on the migrated table will lock other tables as well. That'll easily cause a complete stall.

Last, @Frank-Leemkuil 's output shows "Session lock gh-ost.506.lock expected to be found but wasn't", meaning gh-ost was not even able to acquire the session lock (used internally to prove that the session is alive), which is confusing.

I do not intend to work on any of these in the immediate future; Galera is not on our priority list, and personally I do not have production experience with Galera.

If anyone reading this wants to give this a try please let me know.

@taylorchu

I used docker-compose to test a minimal 3-node Percona XtraDB cluster. The main blocker is the default cut-over algorithm, as described before, but --cut-over two-step works.

@timglabisch

timglabisch commented Sep 15, 2017

Doesn't work for me.

When I have:

mysqld --version
mysqld  Ver 5.7.18-15-57 for debian-linux-gnu on x86_64 (Percona XtraDB Cluster (GPL), Release rel15, Revision 7693d6e, WSREP version 29.20, wsrep_29.20)

and I run:

./gh-ost --cut-over two-step --host=192.168.10.99 --user=pxcuser --password=pxcpass --database=foo --table=documents --alter="ADD COLUMN x INT NOT NULL DEFAULT '0'" --chunk-size=2000 --max-load=Threads_connected=20 --allow-on-master --initially-drop-ghost-table --execute

I'll get:

Copy: 0/1 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: mysql-bin.000001:291700; State: migrating; ETA: N/A
Copy: 0/1 0.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 1s(copy); streamer: mysql-bin.000001:294236; State: migrating; ETA: N/A
Copy: 2/2 100.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 1s(copy); streamer: mysql-bin.000001:295917; State: migrating; ETA: due
2017-09-15 23:14:19 ERROR Error 1105: Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK with pxc_strict_mode = ENFORCING
Copy: 2/2 100.0%; Applied: 0; Backlog: 0/1000; Time: 2s(total), 1s(copy); streamer: mysql-bin.000001:298453; State: migrating; ETA: due
2017-09-15 23:14:20 ERROR Error 1105: Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK with pxc_strict_mode = ENFORCING
Copy: 2/2 100.0%; Applied: 0; Backlog: 0/1000; Time: 3s(total), 1s(copy); streamer: mysql-bin.000001:301402; State: migrating; ETA: due
2017-09-15 23:14:21 ERROR Error 1105: Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK with pxc_strict_mode = ENFORCING
Copy: 2/2 100.0%; Applied: 0; Backlog: 0/1000; Time: 4s(total), 1s(copy); streamer: mysql-bin.000001:304353; State: migrating; ETA: due
2017-09-15 23:14:22 ERROR Error 1105: Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK with pxc_strict_mode = ENFORCING
Copy: 2/2 100.0%; Applied: 0; Backlog: 0/1000; Time: 5s(total), 1s(copy); streamer: mysql-bin.000001:307304; State: migrating; ETA: due
2017-09-15 23:14:23 ERROR Error 1105: Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK with pxc_strict_mode = ENFORCING
Copy: 2/2 100.0%; Applied: 0; Backlog: 0/1000; Time: 6s(total), 1s(copy); streamer: mysql-bin.000001:310255; State: migrating; ETA: due
2017-09-15 23:14:24 ERROR Error 1105: Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK with pxc_strict_mode = ENFORCING
Copy: 2/2 100.0%; Applied: 0; Backlog: 0/1000; Time: 7s(total), 1s(copy); streamer: mysql-bin.000001:313617; State: migrating; ETA: due
2017-09-15 23:14:25 ERROR Error 1105: Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK with pxc_strict_mode = ENFORCING
Copy: 2/2 100.0%; Applied: 0; Backlog: 0/1000; Time: 8s(total), 1s(copy); streamer: mysql-bin.000001:316979; State: migrating; ETA: due
2017-09-15 23:14:26 ERROR Error 1105: Percona-XtraDB-Cluster prohibits use of LOCK TABLE/FLUSH TABLE <table> WITH READ LOCK with pxc_strict_mode = ENFORCING
....

@shlomi-noach
Contributor Author

@timglabisch for the sake of clarity, can you specify what doesn't work for you? Did you mean using --cut-over two-step doesn't work?

@timglabisch

@shlomi-noach sorry for the confusion, I updated my comment.

@Roguelazer
Contributor

So, this works okay-ish on 5.6.

On 5.7, the explicit LOCK TABLE is forbidden (and it is used even during the two-step process). It's not immediately clear to me why we need a LOCK TABLE at all in the two-step process, since the DDL change that renames the old table out of the way should itself acquire a write lock on the table. What would you think about adding a --skip-table-lock flag or some such, which would only be valid in two-step cut-over mode and would just skip the LOCK TABLE?

@beanbee

beanbee commented Aug 5, 2018

gh-ost works on our three-node PXC (Percona XtraDB Cluster 5.6.40 with several asynchronous slaves). Prerequisites:

  • PXC version 5.6; theoretically 5.7 with pxc_strict_mode in [PERMISSIVE, DISABLED] should work the same
  • although it is a multi-master setup, we have only one PXC node handling all the WRITE requests
  • two-step instead of the default cut-over

LOCK/UNLOCK TABLES can only be executed on one PXC node (5.6, or 5.7 with pxc_strict_mode != 'ENFORCING'), so if a single node handles all write requests, data consistency is guaranteed by LOCK TABLE.
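
For 5.7, a minimal sketch of relaxing the strict mode around a migration (pxc_strict_mode and its values come from the error messages and notes above; verify against Percona's documentation for your version and weigh the safety implications before changing it):

-- Check the current mode on the write node (PXC 5.7)
SHOW GLOBAL VARIABLES LIKE 'pxc_strict_mode';

-- Allow LOCK TABLES / FLUSH TABLES ... WITH READ LOCK for the duration of the migration
SET GLOBAL pxc_strict_mode = 'PERMISSIVE';

-- ... run gh-ost with --cut-over two-step against this node ...

-- Restore enforcement once the migration is done
SET GLOBAL pxc_strict_mode = 'ENFORCING';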

As described before, the way PXC manages DDL statements (Total Order Isolation) can unintentionally cause gh-ost to hang at the DROP TABLE tbl_old step. This is a 'deadlock' that happens at the Galera level and cannot resolve itself short of a mysqld restart, but two-step is the safe option if the potential loss during cut-over is tolerable.
