CockroachDB: v19.1 to v23.1 Migration via CDC-Sink
By Morgan Winslow
Using the cdc-sink project to migrate data from CockroachDB v19.1 to v23.1
Overview
In this blog I'll be going over the steps required to use the cdc-sink project to migrate a "Self-Hosted" CockroachDB cluster running v19.1 to a "Dedicated" cluster running v23.1.
Using cdc-sink is an alternative to the more traditional upgrade path, which requires a series of incremental major-version upgrades; it is not possible to jump straight from 19.1 to 23.1. The complete upgrade path would look like the following:
19.1 -> 19.2 -> 20.1 -> 20.2 -> 21.1 -> 21.2 -> 22.1 -> 22.2 -> 23.1
While this is certainly doable, it involves many steps, each carrying some risk. An alternative is to use cdc-sink to move only the data. The tool leverages Change Data Capture (CDC) to pipe data out of a source CockroachDB cluster and into a target CockroachDB cluster. I would encourage you to read more about the tool, its feature set, and its limitations in the documentation linked above.
Disclaimer: cdc-sink is a powerful tool, but it is not officially supported by Cockroach Labs except by prior arrangement.
High-Level Steps
- Create v19.1 Self-Hosted Cluster on AWS
- Create v23.1 Dedicated Cluster
- Download cdc-sink project
- Alter one line in cdc-sink project and create binary
- Initialize source database with data and start running a workload (tpcc)
- Start cdc-sink in 'immediate' mode and start changefeed in source cluster
- Monitor progress in target cluster
- After initial dump has made significant progress, stop changefeed
- Start new changefeed in 'transactional' mode
- Stop workload and cut over to target cluster. Compare source and target cluster data to ensure consistency (see the comparison sketch below)
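On that last point: row counts alone don't prove the two clusters are identical, but they make a quick smoke test once the workload is stopped and the changefeed has drained. A minimal sketch, run verbatim against both clusters:
-- Run on both source and target after writes have stopped;
-- every per-table count should match exactly.
SELECT 'warehouse' AS tbl, count(*) AS n FROM tpcc.warehouse
UNION ALL SELECT 'district', count(*) FROM tpcc.district
UNION ALL SELECT 'customer', count(*) FROM tpcc.customer
UNION ALL SELECT 'history', count(*) FROM tpcc.history
UNION ALL SELECT 'order', count(*) FROM tpcc."order"
UNION ALL SELECT 'new_order', count(*) FROM tpcc.new_order
UNION ALL SELECT 'item', count(*) FROM tpcc.item
UNION ALL SELECT 'stock', count(*) FROM tpcc.stock
UNION ALL SELECT 'order_line', count(*) FROM tpcc.order_line;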
Create Clusters
Self-Hosted - v19.1
For my self-hosted setup, I started a 3-node cluster with all nodes running on a single EC2 instance in AWS. This isn't a true production setup, but it will get the job done for the purposes of this test. You can follow the steps outlined here for this portion of the exercise.
Dedicated - v23.1
Dedicated is a fully managed CockroachDB offering deployed on cloud infrastructure. You can follow the steps outlined here to create a Dedicated cluster. This exercise could also be completed using a Serverless cluster.
I used a Dedicated GCP cluster with 3 nodes, 8vCPU per node, and 400GiB disk per node.
Downloading and Installing cdc-sink
cdc-sink is a Go binary and can be installed from source:
go install github.com/cockroachdb/cdc-sink@latest
$HOME/go/bin/cdc-sink version
You can read more about installing cdc-sink here.
Alter Regex
Here I will outline one change I had to make for this process to work. If you'd rather not make the change manually, feel free to grab the updated binary here.
The changefeed file output format changed between v19 and v23, which caused an error in how staging tables were named. I won't go too far into the weeds here, but we were able to get around this by changing a regex in the ndjson.go file.
Before:
ndjsonRegex = regexp.MustCompile(`^/(?P<targetDB>[^/]+)/(?P<targetSchema>[^/]+)/(?P<date>\d{4}-\d{2}-\d{2})/(?P<uniquer>.+)-(?P<topic>[^-]+)-(?P<schema_id>[^-]+).ndjson$`)
After:
ndjsonRegex = regexp.MustCompile(`^/(?P<targetDB>[^/]+)/(?P<targetSchema>[^/]+)/(?P<date>\d{4}-\d{2}-\d{2})/(?P<timestamp>[^-]+)-(?P<topic>[^-]+)-(?P<schema_id>[^-]+)-(?P<uniquer>.+).ndjson$`)
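If you'd like to sanity-check the new pattern before rebuilding, here's a tiny standalone Go program. The file path in it is invented for illustration and simply mimics the v23-style naming the regex expects:
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as the updated ndjson.go.
var ndjsonRegex = regexp.MustCompile(`^/(?P<targetDB>[^/]+)/(?P<targetSchema>[^/]+)/(?P<date>\d{4}-\d{2}-\d{2})/(?P<timestamp>[^-]+)-(?P<topic>[^-]+)-(?P<schema_id>[^-]+)-(?P<uniquer>.+).ndjson$`)

func main() {
	// Invented v23-style path, shaped to match the new file naming.
	path := "/tpcc/public/2023-06-01/202306010000000000000000000.0000000001-customer-1-7.ndjson"
	match := ndjsonRegex.FindStringSubmatch(path)
	if match == nil {
		fmt.Println("no match")
		return
	}
	// Print each named capture group and what it grabbed.
	for i, name := range ndjsonRegex.SubexpNames() {
		if i > 0 && name != "" {
			fmt.Printf("%-12s = %s\n", name, match[i])
		}
	}
}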
Now that we have updated the file, we can build a new binary.
go build .
Special shout-out to my colleague Bob Vawter, the primary contributor to cdc-sink, for spoon-feeding me this fix.
Prepare Target Database
Create the _cdc_sink staging database and lower its GC TTL:
CREATE DATABASE _cdc_sink;
ALTER DATABASE _cdc_sink CONFIGURE ZONE USING gc.ttlseconds=300;
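cdc-sink uses this database for its staging and bookkeeping tables, which see constant insert/delete churn, so a short GC TTL keeps them lean. To confirm the zone change took effect (just a sanity check):
SHOW ZONE CONFIGURATION FROM DATABASE _cdc_sink;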
Create an empty tpcc database on the target and drop its foreign key constraints. (cdc-sink's immediate mode applies incoming rows as they arrive, without transactional ordering across tables, so foreign keys would reject out-of-order inserts during the load.)
cockroach workload init tpcc 'postgresql://root@0.0.0.0:26257?sslmode=disable'
ALTER TABLE customer DROP CONSTRAINT customer_c_w_id_c_d_id_fkey;
ALTER TABLE district DROP CONSTRAINT district_d_w_id_fkey;
ALTER TABLE history DROP CONSTRAINT history_h_c_w_id_h_c_d_id_h_c_id_fkey;
ALTER TABLE history DROP CONSTRAINT history_h_w_id_h_d_id_fkey;
ALTER TABLE new_order DROP CONSTRAINT new_order_no_w_id_no_d_id_no_o_id_fkey;
ALTER TABLE "order" DROP CONSTRAINT order_o_w_id_o_d_id_o_c_id_fkey;
ALTER TABLE order_line DROP CONSTRAINT order_line_ol_w_id_ol_d_id_ol_o_id_fkey;
ALTER TABLE order_line DROP CONSTRAINT order_line_ol_supply_w_id_ol_i_id_fkey;
ALTER TABLE stock DROP CONSTRAINT stock_s_w_id_fkey;
ALTER TABLE stock DROP CONSTRAINT stock_s_i_id_fkey;
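To verify nothing was missed, you can list any remaining foreign keys while connected to the tpcc database; this should return zero rows before you start the load:
SELECT constraint_name, table_name
  FROM information_schema.table_constraints
 WHERE constraint_type = 'FOREIGN KEY';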
Prepare Source Database
The source cluster needs rangefeeds enabled and an enterprise license, since changefeeds are an enterprise feature in v19.1.
SET CLUSTER SETTING kv.rangefeed.enabled = true;
SET CLUSTER SETTING cluster.organization = 'your-org-here';
SET CLUSTER SETTING enterprise.license = 'your-key-here';
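A quick check that the rangefeed setting took effect:
SHOW CLUSTER SETTING kv.rangefeed.enabled;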
Create the tpcc database, load 100 warehouses of data, and start running the workload.
cockroach workload init tpcc 'postgresql://root@0.0.0.0:26257?sslmode=disable' --warehouses=100
cockroach workload run tpcc 'postgresql://root@0.0.0.0:26257?sslmode=disable' --warehouses=100 --duration=60m
Start cdc-sink
./cdc-sink start --bindAddr :30004 --disableAuthentication --targetConn "postgresql://root:password@morgan-cdc-sink-k2h.gcp-us-east1.cockroachlabs.cloud:26257/?sslmode=verify-full&sslrootcert=morgan-cdc-sink-ca.crt" --logDestination "cdc-sink-log" --immediate -v &
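The --immediate flag is what tells cdc-sink to apply rows as they arrive rather than staging them. When it's time to switch to 'transactional' mode (per the high-level steps), the idea is to stop this process and restart it with the same arguments minus --immediate, so mutations are staged and applied at resolved timestamps. A sketch, reusing the connection details above; check the cdc-sink docs for the exact mode semantics:
./cdc-sink start --bindAddr :30004 --disableAuthentication --targetConn "postgresql://root:password@morgan-cdc-sink-k2h.gcp-us-east1.cockroachlabs.cloud:26257/?sslmode=verify-full&sslrootcert=morgan-cdc-sink-ca.crt" --logDestination "cdc-sink-log" -v &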
Create Changefeed on Source
CREATE CHANGEFEED FOR TABLE tpcc.customer, tpcc.district, tpcc.history, tpcc.item, tpcc.new_order, tpcc."order", tpcc.order_line, tpcc.stock, tpcc.warehouse INTO 'experimental-http://127.0.0.1:30004/tpcc/public?insecure_tls_skip_verify=true' WITH updated, resolved='10s';
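Once the changefeed is running, you can watch it from the source cluster and pause it when it's time to switch modes. The job ID below is a placeholder; substitute the one returned by the first query:
-- On the v19.1 source cluster: find the changefeed job.
SELECT job_id, status, fraction_completed
  FROM [SHOW JOBS]
 WHERE job_type = 'CHANGEFEED';

-- Stop it when the initial dump has caught up.
PAUSE JOB <job_id>;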