Atlassian outage continues, some customers fear data loss • The Register
Atlassian is still trying to recover from its recent software script fiasco and hopes no customer data has been lost. That may be more than Microsoft can manage if OneDrive, as some have reported, has been corrupting large uploads intermittently for at least two months.
Four days after some Atlassian customers began experiencing problems with the cloud giant’s collaboration software, recovery efforts continue and a few people fear they may not get their data back.
One wrote to The Register to ask about that possibility after the company responded, via Twitter, to a request to confirm that customer data was backed up and failed to provide that confirmation.
“We expect most site recovery to occur with minimal or no data loss,” the biz said on Thursday.
Minimal data loss is of course not the same as no data loss, and this is understandably distressing for those involved.
“This is extremely concerning to us, as our mission-critical institutional knowledge is present in Confluence at this time,” an Atlassian customer, who asked not to be identified, said in an email to The Register. “And that this message conflicts with the ‘maintenance script disabled a small number of sites’ message that we keep getting. This would also explain why it took days with so many technicians working 24/7.”
The Register asked Atlassian about the IT outage, which the company’s status page still lists as ongoing. Near the end of the day on Thursday, an Atlassian spokesperson offered a similarly qualified reassurance.
“We will continue to investigate and resolve the incident,” the spokesperson said. “At this point, we believe potential data loss will be minimal to none. We are working hard to resolve the incident and get customers back online.”
We were also told that the incident affects a relatively small number of Atlassian customers: about 400. That’s just 0.18 percent of the company’s 226,000 customers, which is no comfort to the hundreds who still don’t have access to their data.
When we checked back with our reader on Friday, the issue had still not been resolved.
“No data has been restored to us yet and these services all remain unreachable,” said our source. “We have not been given an ETA on access.”
When the service is fully restored, Atlassian may provide a detailed report on what happened using its Incident Postmortem template.
Meanwhile, Microsoft’s OneDrive has apparently been corrupting large, multi-part file uploads intermittently for at least two months.
Reports flagging the issue date back to early February, when they surfaced in a forum post for the backup app Duplicati. Users of another backup app, rclone, started discussing what appears to be the same issue on March 24, 2022.
Three days ago, Nick Craig-Wood, creator of rclone, posted a bug report on the GitHub repo for Microsoft’s OneDrive. “Sometimes (maybe once in 20) multipart uploads of a 128MiB file get corrupted,” explains his post.
Several other people say they reproduced this occasional bug.
“During testing this morning, I still see the problem,” wrote GitHub user “rleeden”. “I created a random test hierarchy of 100 files with random data between 128M and 256M, and uploaded it to OneDrive via the web interface.”
“When I check the sha1sum of the original files and the files on OneDrive, I see that 16 out of 100 files are corrupted.”
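For readers who want to run a similar check on their own uploads, here is a minimal sketch in Python. It assumes the files have been downloaded from OneDrive again into a second local folder; the report above does not say how the remote checksums were obtained. The function names `sha1_of_file` and `find_mismatches` are ours, for illustration only.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-1 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # read() returns b"" at end of file, which stops the iterator
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(original_dir, downloaded_dir):
    """List relative paths whose downloaded copy is missing or hashes differently.

    Hypothetical helper: both arguments are local folders, with the second
    holding copies fetched back from cloud storage.
    """
    original_root = Path(original_dir)
    downloaded_root = Path(downloaded_dir)
    mismatched = []
    for original in sorted(original_root.rglob("*")):
        if not original.is_file():
            continue  # skip directories
        relative = original.relative_to(original_root)
        copy = downloaded_root / relative
        if not copy.is_file() or sha1_of_file(original) != sha1_of_file(copy):
            mismatched.append(relative.as_posix())
    return mismatched
```

Calling `find_mismatches("originals", "downloaded")` returns the files that differ or are missing from the downloaded folder. An empty list means every file survived the round trip intact.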
On Thursday, rleeden was unable to reproduce the error, suggesting that someone at Microsoft may have fixed the bug. Craig-Wood also reported that he no longer saw any corrupt files.
The Register asked Microsoft, which was programmatically notified of the bug report when it was submitted, whether anyone at the company had fixed OneDrive and, if so, whether there will be an announcement for OneDrive users who may not realize that some of their uploads may be damaged.
We haven’t heard back.
Updated to add
Microsoft got back to us to say that the OneDrive issue has apparently been resolved.