



Where Bigtable comes in

As mentioned earlier, after researching our options, we found Bigtable to be the best solution for our needs. It provided the features we were looking for:

99.999% SLA

Virtually limitless scale

Single-digit millisecond latency

Built-in monitoring system

Multi-region replication

Geographic distribution, which enables seamless replication of data across regions and reduces latency.

On-demand scaling of compute resources and storage, which can be tuned to user traffic and lets the system grow as needed.

Seamless integration with our general architecture. We use Google Cloud services for many other parts of our system, such as the APIs that interact with these databases.

A NoSQL database that does not require relational semantics, since the datasets we migrated are indexed on a single primary key in our applications.

Taking action

We targeted three major self-managed datasets for migration. The two largest were organized in a sharded database architecture. The first thing we did was prepare the new Bigtable database. We iterated on the schema design and conducted a thorough performance analysis of Bigtable to ensure an uninterrupted user experience during and after the migration. After that, we made some minor changes to our application code so that it could interact with Bigtable. Finally, we implemented a robust post-migration disaster recovery process to mitigate potential risks.

For the migration itself, we first enabled a dual-write phase in our applications. This meant writing new link data to both the existing MySQL tables and the new Bigtable tables simultaneously. Once data was being written to the Bigtable instance, we ran the migration scripts. Using Go scripts, we walked through each of the existing MySQL datasets and inserted each row into Bigtable, which let us clean up outdated information and backfill older records with new field data.
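The dual-write phase can be sketched in Go roughly as follows. This is a minimal illustration, not our production code: the `Link` record, the `Store` interface, the in-memory `memStore`, and the `DualWriter` type are all hypothetical names, and the choice to treat MySQL as the source of truth while only reporting Bigtable write failures is an assumption about how such a phase might be wired.

```go
package main

import (
	"errors"
	"fmt"
)

// Link is a hypothetical record shape; the real schema is not shown here.
type Link struct {
	ID  string // short-link identifier, used as the single primary key
	URL string // destination URL
}

// Store abstracts a datastore so both backends are written through one call.
type Store interface {
	Put(l Link) error
}

// memStore is an in-memory stand-in for a real MySQL or Bigtable client.
type memStore struct {
	rows map[string]Link
	fail bool // when true, every Put returns an error
}

func newMemStore() *memStore { return &memStore{rows: map[string]Link{}} }

func (m *memStore) Put(l Link) error {
	if m.fail {
		return errors.New("store unavailable")
	}
	m.rows[l.ID] = l
	return nil
}

// DualWriter writes to the primary (assumed source of truth) first,
// then mirrors the write to the secondary.
type DualWriter struct {
	Primary   Store
	Secondary Store
	// OnSecondaryError receives mirror failures so they can be logged
	// and reconciled later instead of failing the user's request.
	OnSecondaryError func(l Link, err error)
}

func (d *DualWriter) Put(l Link) error {
	if err := d.Primary.Put(l); err != nil {
		return fmt.Errorf("primary write: %w", err)
	}
	if err := d.Secondary.Put(l); err != nil && d.OnSecondaryError != nil {
		d.OnSecondaryError(l, err)
	}
	return nil
}

func main() {
	mysql, bigtable := newMemStore(), newMemStore()
	w := &DualWriter{Primary: mysql, Secondary: bigtable}
	if err := w.Put(Link{ID: "abc123", URL: "https://example.com"}); err != nil {
		panic(err)
	}
	fmt.Println("mysql rows:", len(mysql.rows), "bigtable rows:", len(bigtable.rows))
}
```

Because both backends sit behind the same interface in this sketch, swapping which one is primary later in the cutover is a one-line change.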

Cleaning up clutter along the way

During the migration process, we were able to free up a substantial amount of storage. Because some early features of the Bitly platform had since been retired, we were able to exclude just under half of all the data stored in MySQL from the migration to Bigtable. We were creating a completely clean dataset, so we could simply skip the unwanted rows during the migration.

A total of 80 billion MySQL rows were run through the migration process, resulting in just over 40 billion records finding a new home in Bigtable. In the end, our starting point in Bigtable was a 26 TB dataset, before replication. By running a series of Go scripts in parallel on a handful of machines, we were able to complete this migration in 6 days. (Go rarely disappoints.)
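The combination of parallel workers and skipping unwanted rows can be sketched as a small worker pool. This is an illustrative sketch under stated assumptions, not the actual migration script: the `Row` type, its `Deprecated` flag, the `Sink` interface, and the in-memory `memSink` are hypothetical, and a real run would read from MySQL and write through a Bigtable client.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Row is a hypothetical MySQL row; Deprecated marks data from a retired feature.
type Row struct {
	ID         string
	URL        string
	Deprecated bool
}

// Sink is where kept rows are written; a real one would wrap a Bigtable client.
type Sink interface {
	Write(r Row) error
}

// memSink is an in-memory, concurrency-safe stand-in for the destination table.
type memSink struct {
	mu   sync.Mutex
	rows map[string]Row
}

func (s *memSink) Write(r Row) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.rows[r.ID] = r
	return nil
}

// Migrate fans rows out to `workers` goroutines and skips unwanted rows.
// It returns the number of rows written, the number skipped, and the
// first write error encountered, if any.
func Migrate(rows []Row, sink Sink, workers int) (written, skipped int64, err error) {
	ch := make(chan Row)
	var wg sync.WaitGroup
	var once sync.Once
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := range ch {
				if r.Deprecated {
					atomic.AddInt64(&skipped, 1)
					continue
				}
				if werr := sink.Write(r); werr != nil {
					once.Do(func() { err = werr })
					continue
				}
				atomic.AddInt64(&written, 1)
			}
		}()
	}
	for _, r := range rows {
		ch <- r
	}
	close(ch)
	wg.Wait()
	return written, skipped, err
}

func main() {
	rows := make([]Row, 0, 10)
	for i := 0; i < 10; i++ {
		// Every other row is marked as belonging to a retired feature.
		rows = append(rows, Row{ID: fmt.Sprintf("id-%d", i), URL: "https://example.com", Deprecated: i%2 == 0})
	}
	sink := &memSink{rows: map[string]Row{}}
	written, skipped, err := Migrate(rows, sink, 4)
	fmt.Println("written:", written, "skipped:", skipped, "err:", err)
}
```

In practice each worker would more likely own a shard or primary-key range rather than share one channel, but the skip-then-write loop stays the same.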

Write twice, cut once: validation before cutover

Next came the data validation and cutover period, during which we started returning data from Bigtable while continuing to write to MySQL in case we needed to roll back.

As we entered the validation process, we compared data between MySQL and Bigtable and logged any discrepancies whenever a link was clicked or created. After confirming that all responses were consistent, we proceeded with a gradual cutover, rolling out by percentage until Bigtable was handling 100% of all writes and reads. After a comfortable running period, we turned off dual writes completely and eventually decommissioned our primary MySQL hosts, sending them to live on a farm upstate.
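One way to picture the comparison and percentage rollout is the sketch below. It assumes that traffic is bucketed by a stable hash of the link key, which is one common approach and not necessarily the mechanism we used; `bucket`, `UseBigtable`, and `Compare` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bucket maps a key to a stable value in [0, 100).
func bucket(key string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key))
	return h.Sum32() % 100
}

// UseBigtable reports whether a request for key should be served from
// Bigtable at the given rollout percentage (0 to 100).
func UseBigtable(key string, percent uint32) bool {
	return bucket(key) < percent
}

// Compare returns a description of the mismatch between the two stores'
// answers for key, or an empty string when they agree.
func Compare(key, fromMySQL, fromBigtable string) string {
	if fromMySQL == fromBigtable {
		return ""
	}
	return fmt.Sprintf("mismatch for %q: mysql=%q bigtable=%q", key, fromMySQL, fromBigtable)
}

func main() {
	keys := []string{"abc123", "xyz789", "short1", "short2", "short3"}
	for _, percent := range []uint32{0, 25, 50, 100} {
		served := 0
		for _, k := range keys {
			if UseBigtable(k, percent) {
				served++
			}
		}
		fmt.Printf("rollout %d%%: %d of %d keys served from Bigtable\n", percent, served, len(keys))
	}
	if msg := Compare("abc123", "https://example.com", "https://example.org"); msg != "" {
		fmt.Println(msg)
	}
}
```

Hashing the key keeps each link on the same side of the rollout as the percentage grows, so a key that moves to Bigtable at 25% stays there at 50% and beyond.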

Google’s got Bitly’s back(up)

Our data is our lifeline and we do our best to ensure that it is always protected. We put together a redundancy plan using both Bigtable backups and a process to keep a copy of the data outside of Bigtable for true disaster recovery.

Our first line of defense is the ability to switch over to a backup Bigtable dataset if we need it. Beyond that, we implemented two more layers of defense to protect against instance failures, data corruption, and other data failures that require restoring one or more tables from backup.

The process begins with daily Bigtable backups of each table, which we keep for a set number of days. We then run a Dataflow job roughly once a week to export data from Bigtable to Cloud Storage. If needed, we can also use Dataflow to import data from Cloud Storage back into a new Bigtable table.

While running Dataflow jobs that export from Bigtable to Cloud Storage, we observed impressive export speeds averaging 7-8 million rows read per second, sometimes peaking at 15 million rows per second. During that time, production reads and writes continued uninterrupted. When testing a Cloud Storage-to-Bigtable restore job, we observed write speeds that scaled with instance size. At our maximum regional node quota, we averaged just under 2 million rows per second written to the new table, as expected.
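To put those rates in perspective, here is a back-of-the-envelope estimate in Go. It assumes, purely for illustration, a table of about 40 billion rows (the size right after migration), an export rate of 7.5 million rows per second (the midpoint of the observed average), and a restore rate of 2 million rows per second. `EstimateDuration` is a hypothetical helper, and real jobs also spend time on startup and worker scaling.

```go
package main

import (
	"fmt"
	"time"
)

// EstimateDuration returns how long it takes to process `rows` rows
// at a steady rate of `rowsPerSecond`.
func EstimateDuration(rows, rowsPerSecond float64) time.Duration {
	return time.Duration(rows / rowsPerSecond * float64(time.Second))
}

func main() {
	const rows = 40e9 // assumed table size, in rows

	export := EstimateDuration(rows, 7.5e6)
	restore := EstimateDuration(rows, 2e6)

	fmt.Println("export estimate: ", export.Round(time.Minute))
	fmt.Println("restore estimate:", restore.Round(time.Minute))
}
```

Under these assumptions, a full export works out to roughly an hour and a half, and a full restore to about five and a half hours.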

Short links for long distances

As mentioned above, we chose Bigtable not only because it met our technical requirements and operational needs, but also because it gives us room for future growth. The ability to scale seamlessly over time while improving the system’s availability SLA was a key factor in our decision.

As we grow by 5x, 10x, or more, it is imperative that our data backbone scales accordingly, so that the SLAs we offer our customers hold steady or even gain another coveted ‘9’. We have big plans for the next few years, and Bigtable can help make them happen.

Interested in learning more? During our evaluation and eventual adoption of Bigtable, we found the following resources helpful.

Bonus: If you want to learn more about how we migrated our data, we’ll be talking about that very topic at GopherCon 2023 in San Diego in September.

