How do you rebuild an airplane, at 40,000 feet, while it’s still packed with passengers? That was the question asked by Daniel Tao, head of engineering at Bitbucket, as he discussed moving the source shack to AWS data centers.
The group had to migrate 50 million repositories, along with the billion or so daily interactions, from the bitsheds of parent company Atlassian to Amazon’s cloud without the whole thing falling into a heap. Oh, and it had to do it during a global pandemic.
Bitbucket Cloud has been around for over 10 years and, along with an on-premises version, is Atlassian’s take on source control. Atlassian’s lineup expanded in April to include Open DevOps, built around Jira, Confluence, and Opsgenie as well as Bitbucket, but, unlike the rest of the Atlassian line, Bitbucket had remained firmly in its own data centers.
The service was plagued by storage issues in 2018 and outages in 2019. Something had to change, and not just the dropping of Mercurial support in 2020.
“The architecture has always assumed it would be in a data center,” Tao said. “And so we really had to redo key aspects of Bitbucket’s architecture and rebuild it in a new way in a cloud environment, while still operating our data centers.”
Rival DevOps outfit GitLab memorably moved its own data to Google’s cloud in 2018, shortly after Microsoft announced its acquisition of GitHub.
As for the Bitbucket migration, it took 18 months from start to finish, with the last push taking over three hours at the end of August. The pandemic complicated things as the team had to deal with “the same curveballs thrown at every tech company just to learn how to work remotely and come up with new processes and rituals around that,” Tao said.
It also had to ensure sufficient capacity was purchased and allow for any last-minute problems in its plans.
From a technical standpoint, the biggest challenge was bandwidth, as customer data was first replicated from Atlassian’s servers to AWS’s. “Every time a customer pushed to their Git repository, every time a customer left a comment on a pull request, and so on, all that data was replicated in real time with a millisecond delay to our new environment in AWS,” Tao said.
The downtime at the end of August was then needed only to point Bitbucket’s services at this new “source of truth”.
“The vast majority of our customers wouldn’t have noticed,” he claimed.
That’s all well and good from a technical standpoint, but some customers may not be too happy about having their data moved to AWS.
“In terms of data location,” said Robert Krohn, head of Agile and DevOps engineering at Atlassian, “our data centers are in the same kind of region, so we didn’t have to inform our customers that we were moving stuff.”
That might raise an eyebrow or two, though Krohn added, “Some of the big clients we’ve talked to… but overall that wasn’t a limitation.”
While a jump into the land of Bezos may leave some developers sore, anyone who has already signed up for Atlassian’s cloud products probably has some data in AWS’s data centers already. Atlassian’s internal Platform-as-a-Service, Micros, runs on top of AWS and hosts the majority of the company’s cloud products. That now includes (after a few tweaks) Bitbucket.
“It was the final boss,” Tao noted.
And despite the switch to AWS, Atlassian is still the one on the hook when things go wrong.
Unsurprisingly, Tao and Krohn were keen to emphasize the improvements in performance and scalability after the migration. “Behind the scenes,” Tao said, “the number of incidents has fallen to zero.” A look at the company’s status page shows just one problem, around pipelines earlier this week, since the final switch in August.
However, there are still those pesky data centers to deal with, now redundant and littered with customer data. “We have to take the data on those hard drives very seriously,” Krohn said, “and we go through a process to securely destroy them in a certified manner.”
But what about the remaining hardware? “We’ve had parties where we’ve sent people over and said, oh, go cut the cables,” Krohn said.
Because no matter how carefully you plan your migration and what precautions you have to take, let’s be honest: there is no party like a DC decommissioning party. ®