The widespread BlackBerry outage in North America last week inconvenienced some corporate users of the popular wireless email service. But what most irked them was the lack of an immediate explanation or frequent updates about the outage from BlackBerry vendor Research In Motion (RIM).
The outage began on Tuesday evening, but RIM didn’t issue a public statement acknowledging it until Wednesday afternoon – after the service had been restored. And it was late on Thursday before the company explained that the problem was triggered by a flawed installation of cache optimisation software and a subsequent system fail-over snafu.
The vendor apologised to BlackBerry users for the outage, which didn’t affect users in Europe. But customers such as David Maynor, chief technology officer at Errata Security in Atlanta, sharply criticised RIM for its initial silence about the outage.
“I’m actually really mad about it,” Maynor said on Wednesday. “I’m mad enough to switch to another service. Everyone makes mistakes, but [RIM’s] cardinal sin is that they didn’t inform their users.”
Maynor, who has been a BlackBerry user for the past three years, said he lost his service for about 14 hours, ending on Wednesday. He added that he couldn’t get information about the outage from the BlackBerry website or his telecommunications carrier. Instead, he turned to online discussion forums.
John D Halamka, CIO at both CareGroup HealthCare System and Harvard Medical School in Boston, said on Friday that he was satisfied with RIM’s explanation and apology. But he said he hopes that in the future, the vendor will be “more proactive about acknowledging problems and communicating with their customers”. Halamka pointed out that as of Friday morning, RIM’s BlackBerry and corporate websites still had no information about the outage.
Earlier in the week, Halamka said that about 500 of his users lost BlackBerry service for 11 hours starting at 8 p.m. EDT on Tuesday. The users were able to switch to mobile phones and web-based email during the outage, so they could continue to function at work, according to Halamka.
But, he added, CareGroup’s BlackBerry user base “includes many clinicians who use the devices as part of patient care and need to ensure [the devices’] security and reliability. If the downtime was caused by an external security breach or an internal problem with change control [at RIM], we need to know about it”.
In its brief initial statement, RIM said that a “service interruption” had affected email deliveries to users in North America starting on Tuesday evening but that service for most customers had been restored overnight. Phone calls made on the BlackBerry handsets weren’t affected, it said.
RIM issued another short statement Thursday afternoon, saying it was still analysing the cause of the outage. Finally, later that night, the company pointed to the installation of a cache optimisation program on a server that powers the BlackBerry service.
Testing of the software “proved to be insufficient”, said RIM, adding that the installation “triggered a compounding of interaction errors between the system’s operational database and cache”. The company’s IT staff was able to isolate the problem but couldn’t fix it. And then, RIM said, a fail-over process that had been tested repeatedly didn’t work as expected.
The company added that it plans to bolster some of its testing, monitoring and recovery processes in an effort to prevent repeat episodes.
RIM officials couldn’t be reached for further comment on any of the statements, despite repeated attempts via phone calls and email.