Monitoring SaaS Performance Is Critical. Here are 5 Key Points to Consider
On January 24, numerous European enterprises using Microsoft 365/Office 365 experienced email accessibility problems that locked some employees out of their accounts for a full day. A few days later, Gmail became inaccessible to wide swaths of email users in the UK, other parts of Europe, Africa and parts of Asia. Gmail was hit again on February 11. These incidents suggest a much-needed reality check.
Performance issues (speed and availability) for popular software-as-a-service (SaaS) solutions are not uncommon; however, this doesn’t make a slow or unreliable service any less painful for enterprise users. Think of popular SaaS-based CRM applications that essentially power sales, marketing, finance, customer service operations, and more. Organizations have become so dependent on these solutions that when performance lags, so does the entire operation’s productivity and revenue-generating engine.
Enterprise users often bear the brunt: revenue and productivity losses, as well as disgruntled employees and customers. For this reason, organizations relying on SaaS must bear ultimate responsibility for ensuring strong performance for their employees. Here are some key points to consider:
1. Performance can (and often does) vary wildly across geographies. Just because your New York-based employees enjoy fast, reliable experiences does not mean your Los Angeles workers do too. A huge number of geography-specific elements, including cloud provider data centers, content delivery networks (CDNs), and local and regional ISPs, color employee experiences across diverse locations. A good place to start monitoring is your most critical employee locations. What are the largest offices? Where are SaaS applications used the most?
2. Monitoring from too few vantage points leaves blind spots. Similarly, if employees are having strong experiences on their work desktops, this is no guarantee that employees working on mobile devices are having an equally strong experience (especially true as mobile networks grow more constrained). It is important to monitor performance from not just multiple geographies, but multiple employee access scenarios and network vantage points.
Cloud monitoring (which measures the round-trip time of packets sent from a cloud infrastructure to a website or other digital service and back again) can help. However, in many instances this entails monitoring from the same infrastructure that hosts the SaaS-based service. In this scenario, cloud monitoring provides only a first-mile view of SaaS performance. Sure, the SaaS provider’s servers may be up and running, but this is not an indicator of real-world reliability and speed, as packets must traverse globally dispersed networks and infrastructures en route to employees, and these journeys are often riddled with potholes.
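To make the idea of multiple vantage points concrete, probe results can be grouped by location and access type before they are summarized, so that a healthy segment cannot mask an unhealthy one. The following is only a minimal sketch under stated assumptions, not a description of any particular monitoring product: the `Probe` record, its field names, and the function name are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class Probe:
    """One synthetic check against a SaaS endpoint (hypothetical record shape)."""
    location: str      # e.g. "New York"
    access: str        # e.g. "desktop" or "mobile"
    ok: bool           # whether the check succeeded
    latency_ms: float  # response time observed by the check


def summarize_by_vantage_point(probes):
    """Return {(location, access): (success_rate, median_latency_ms)}."""
    groups = defaultdict(list)
    for probe in probes:
        groups[(probe.location, probe.access)].append(probe)

    summary = {}
    for key, items in groups.items():
        success_rate = sum(p.ok for p in items) / len(items)
        summary[key] = (success_rate, median(p.latency_ms for p in items))
    return summary
```

A segment that never appears in the summary is a blind spot: no probes are running from that location or access scenario.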
3. Micro-outages and slowdowns are silent SLA killers. Micro-outages differ from global outages in that they are shorter in duration (usually less than one hour) and impact isolated employee segments. Micro-outages are becoming more common as internet infrastructure expands at a breathtaking pace (ironically, to support more internet users and deliver stronger experiences). The larger the surface area grows, the more opportunities there are for localized breaks, and most micro-outages are caused by geography-specific elements beyond the SaaS provider’s firewall.
In an era where every experience matters, micro-outages can be even more damaging than a complete outage. Impacted employees are frustrated, not understanding why co-workers are able to access the application while they cannot. This angst grows when the SaaS provider’s dashboard shows the service as up and running.
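The definition of a micro-outage given here (short, and limited to isolated employee segments) can be written down as a simple rule. This is a rough sketch only: the one-hour limit comes from the description above, while the 25 percent cutoff for "isolated" is an assumption chosen purely for illustration, as are the names.

```python
from datetime import timedelta

MAX_MICRO_DURATION = timedelta(hours=1)  # "usually less than one hour"
MAX_MICRO_SEGMENT_SHARE = 0.25           # assumption: what counts as "isolated"


def is_micro_outage(duration, impacted_segments, total_segments):
    """Return True for a short outage that hits only a small share of segments."""
    if total_segments <= 0:
        raise ValueError("total_segments must be positive")
    share = impacted_segments / total_segments
    return duration < MAX_MICRO_DURATION and 0 < share <= MAX_MICRO_SEGMENT_SHARE
```

An incident flagged this way may never appear on a provider-wide status dashboard, which is why it has to be detected from the affected segment itself.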
The same thing happens when a major slowdown occurs. Maybe the SaaS provider upheld its “five nines” uptime promise, but what about the 10 hours when the service was running at 50 percent speed, causing users to abandon the application? Does the near-perfect SLA reflect the enterprise user’s lost productivity?
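The arithmetic behind that question is worth spelling out. The sketch below compares the downtime a "five nines" promise permits with a speed-weighted view of the same year. The weighting scheme (an hour at 50 percent speed counts as half an hour lost) is an assumption made for illustration, not a standard SLA metric, and the function names are hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability, period_minutes=MINUTES_PER_YEAR):
    """Downtime budget implied by an availability target such as 0.99999."""
    return (1.0 - availability) * period_minutes


def speed_weighted_availability(down_minutes, degraded_minutes, degraded_speed,
                                period_minutes=MINUTES_PER_YEAR):
    """Availability where degraded time counts in proportion to lost speed.

    degraded_speed is the fraction of normal speed delivered (0.5 = half speed).
    """
    lost = down_minutes + degraded_minutes * (1.0 - degraded_speed)
    return 1.0 - lost / period_minutes
```

Under these assumptions, five nines allows about 5.26 minutes of downtime per year, while 10 hours at half speed with no outright downtime works out to a speed-weighted availability of roughly 99.943 percent, closer to three nines than five.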
4. Fast MTTD and MTTR are paramount. SaaS performance issues are inevitable, so instead of focusing on perfection, enterprise users should focus on reducing mean-time-to-detect (MTTD) and mean-time-to-repair (MTTR).
Speed in these areas is critical even if the problem is not rooted in the enterprise user’s domain. If the problem is found to reside in the SaaS provider’s infrastructure or delivery mechanism, enterprise users can proactively flag it to the provider so it can be remediated. If it’s an issue somewhere in the internet wild, IT teams can at least inform employees and let them know they are working on it. If the problem is found to reside within the enterprise user’s side, the organization has an opportunity to address the growing hotspots before employees are impacted at all.
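For teams that want to track these two numbers, both can be computed from a simple incident log. The sketch below assumes MTTD is measured from the start of an incident to its detection, and MTTR from detection to resolution; definitions vary between organizations, and the `Incident` record and its field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Incident:
    """Timestamps for one performance incident (hypothetical record shape)."""
    started: datetime   # when users were first affected
    detected: datetime  # when monitoring or IT noticed
    resolved: datetime  # when normal service was restored


def mean_timedelta(deltas):
    """Average a non-empty collection of timedelta values."""
    deltas = list(deltas)
    if not deltas:
        raise ValueError("at least one incident is required")
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents):
    """Mean time to detect: start of impact to detection (assumed definition)."""
    return mean_timedelta(i.detected - i.started for i in incidents)


def mttr(incidents):
    """Mean time to repair: detection to resolution (assumed definition)."""
    return mean_timedelta(i.resolved - i.detected for i in incidents)
```

Tracking the two values separately shows whether time is being lost in noticing problems or in fixing them.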
5. Many performance optimization opportunities exist behind the firewall. There are numerous opportunities for enterprise SaaS users to optimize performance by looking within their own firewalls. These are often not well known or understood, yet they are among the most easily and readily addressable.
One example is SaaS configurations—the administrative tasks IT teams must handle to get the SaaS application up and running, including employee onboarding, creating group memberships, delegating access privileges, and more. We have seen examples of SaaS performance improving tremendously based on how the service is configured for a particular employee location.
Additionally, there are easily deployable monitoring options that focus only on measuring performance and identifying issues within the firewall. This approach can be useful for smaller office locations that may be lower on the priority scale, but can still benefit from performance assurances and localized adjustments.
SaaS providers should be commended for providing an overall high level of service quality. The fact that they are able to do this as SaaS adoption grows so rapidly is impressive. But SaaS enterprise users would be ill-advised to put blind faith in these providers and believe they are absolved of ultimate performance responsibility. The right monitoring approach is essential to realizing the vast benefits of SaaS while continually identifying areas for performance optimization and insulating against risk.
Mehdi Daoudi is cofounder and CEO of Catchpoint, a digital experience monitoring (DEM) firm. Before Catchpoint, Daoudi spent more than 10 years at DoubleClick and Google, where he was responsible for quality of services, as well as buying, building, deploying, and using various internal and external performance monitoring solutions.