Website and IT outages can cost businesses $5 million an hour. Yet even with this motivation, half of organizations admitted to repeated outages and slowdowns in a new survey.
The good news is that many companies recognize that they need to do better. Leading enterprises understand that performance must be a core cultural value—and the practice of performance engineering is evolving as a result.
Performance failures are expensive
In 2012, Knight Capital's computers started to fail in the worst possible way. Instead of shutting down gracefully, they began issuing commands to buy and sell securities. The orders were queued up to be executed over the coming weeks, but the errant computers dumped them on the market all at once, causing a buying and selling frenzy like a clerk who'd gone insane.
It only took a half hour, but by the end the losses totaled an estimated $440 million. CNN asked whether it was the most expensive computer glitch ever, but somehow "glitch" doesn't seem adequate to describe an event that almost destroyed Knight Capital.
We may never know if this particular glitch will hold the record, because many similar failures aren't as visible. Orders to buy and sell stocks are highly regulated and can't be swept under the rug, but many other failures are invisible. They only slow down or halt a website without leaving much of a trail. The customers show up, click in vain, then move on. No one really knows how many sales are lost, because it's impossible to know what the customers might have purchased if the website had responded quickly.
Building a culture of performance
Many companies are recognizing that it's not enough to call something a glitch, cross their virtual fingers, and hope it never happens again. They need to track their computer systems' online status; guarantee that they're responding to customers, partners, and employees; and measure whether they're delivering the information promptly so no one is clicking, tapping, pounding the return key, and wondering whether they should just go somewhere else.
As corporate leaders realize the importance of their online and mobile presence and start to measure just how much business comes in through all of their channels, they're remaking their organizations to recognize the importance of keeping information flowing quickly and accurately. This responsibility is gaining traction under the umbrella of performance engineering.
Proponents of this new vision believe that enterprises must build a performance-focused culture throughout their organizations, one that measures the experience of end users, both internal and external, and delivers software and hardware that perform well for them. Performance must be prioritized from the beginning of the process and watched vigilantly after the code is deployed.
The state of performance engineering
In April 2015, the independent research firm YouGov launched a broad survey sponsored by Hewlett Packard Enterprise. The survey sought to understand how organizations were embracing performance engineering and building a culture that prioritizes it.
The survey reached 400 different US companies, each with at least 500 employees. It asked a series of questions designed to capture the state of performance engineering. Fifty percent of survey respondents were performance engineers and/or testers, 25 percent were application development managers, and 25 percent were IT operations managers—all with direct responsibility for building and maintaining systems to deliver business value to their end users and shareholders.
Of the respondents, 70 percent agreed that performance engineering is growing in importance, often rapidly. Two-thirds cited complex composite applications as one reason for this trend. Many noted that a single web application is often an increasingly complex mechanism that knits together data from multiple sources. The world seems to be shifting from single applications to complex constellations of web services. If developers are responsible for individual services or apps, there needs to be someone in charge of keeping the entire system running smoothly.
The rise in importance of performance engineering is driven by practical concerns. At least 50 percent of respondents admitted that slowdowns and outages were discouraging customers and frustrating employees. Many characterized the problems as "repeated," and said they were often caused by large spikes in traffic that weren't anticipated when the applications were built.
The consequences are serious. The average firm that responded to the survey said that a major outage could cost between $100,000 and $500,000 in lost revenue per hour. Some of the larger companies with more than 10,000 employees said they could lose $5 million an hour from website or core system outages.
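To put those hourly figures in perspective, here is a minimal sketch of the arithmetic. It assumes revenue is lost at a constant rate for the length of the outage, which is a simplification the survey does not claim; the function name `outage_cost` and the 30-minute duration are illustrative choices only.

```python
def outage_cost(hourly_loss_usd: float, outage_minutes: float) -> float:
    """Estimate lost revenue, assuming losses accrue linearly over time."""
    return hourly_loss_usd * outage_minutes / 60.0


# Hourly loss figures reported by survey respondents.
scenarios = [
    ("average firm, low end", 100_000),
    ("average firm, high end", 500_000),
    ("large enterprise", 5_000_000),
]

for label, hourly_loss in scenarios:
    cost = outage_cost(hourly_loss, outage_minutes=30)
    print(f"{label}: a 30-minute outage costs roughly ${cost:,.0f}")
```

Real losses are rarely this linear, since a spike in traffic or a peak sales period can concentrate the damage, but the estimate gives a starting point for weighing the cost of prevention.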
When organizations contemplated the scope and catastrophic range of these failures, they recognized that the traditional development process just wasn't ready to build a system with adequate provisions for surviving these kinds of issues. Transforming the organization and focusing individuals and teams on performance means empowering them with capabilities to anticipate problems and solve them before they occur. And when problems emerge after deployment, it means giving the team the ability to control failure and mitigate risk.
How do you define "performance engineer"?
"Performance engineering" doesn't refer only to a specific role. More generally, it refers to the set of skills and practices that are gradually being understood and adopted across organizations that focus on achieving higher levels of performance for the technology, for the business, and for end users.
While each company and team is defining the tasks associated with performance engineering a bit differently, they're still finding some agreement on the broad scope of the work to be done. Seventy percent of respondents felt that the most important aspect of performance engineering was developing "post-deployment performance testing." This means building capable instrumentation to ensure that everything is running smoothly. If an anomaly does arise, it also means being able to work together to address the issue and resolve it quickly.
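As one illustration of what such instrumentation might check, the sketch below computes a high percentile of recent response times and flags a breach of a latency budget. Everything here is an assumption for the sake of example: the function names, the 95th percentile, and the 500 ms budget do not come from the survey, and a production system would typically rely on a monitoring tool rather than hand-rolled code.

```python
def percentile(samples, pct):
    """Return the nearest-rank percentile of samples (pct is an integer 1-100)."""
    if not samples:
        raise ValueError("no samples to summarize")
    ordered = sorted(samples)
    # Nearest-rank method: ceil(pct/100 * n), done in integer arithmetic.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def breaches_budget(latencies_ms, budget_ms=500.0, pct=95):
    """True if the chosen percentile of latency exceeds the budget."""
    return percentile(latencies_ms, pct) > budget_ms


# Hypothetical response times (milliseconds) collected after deployment.
recent = [120, 135, 110, 980, 140, 125, 130, 1150, 115, 128]
if breaches_budget(recent):
    print(f"p95 latency {percentile(recent, 95)} ms exceeds the budget")
```

Checking a percentile rather than the average is a deliberate choice: a handful of very slow requests can leave the mean looking healthy while a meaningful share of users are left waiting.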
But that isn't by any means the entire job. Seventy-four percent of the IT operations managers also strongly agreed with the statement, "Collaboration with development teams is vital for effective performance engineering." They felt strongly that working with programmers from the beginning would ensure solid performance. It isn't enough to show up when the code ships, pour some coffee, put your feet on the desk, and watch the dashboards and alerts for problems.
Indeed, 60 percent of respondents agreed strongly with the statement that performance engineering is "mostly proactive." The goal is to watch over the entire process and look for issues that might occur. Teams need to anticipate when databases might balk, third-party services might fail, and systems might fail to load or begin to slow down.
Many also indicated that the role of performance engineer is becoming more important. When asked whether modern capabilities and tools like load balancers, automatic failover systems, and redundant servers would reduce the need for performance engineers, only 52 percent of survey participants agreed.
More respondents felt that the rise of mobile applications and cloud-based solutions and services is actually making the tasks associated with performance engineering more challenging and vital. Mobile applications work through more fragile, often jittery, high-latency, and low-bandwidth mobile networks. These networks often hamper back-end performance because of delays and sessions being held open longer, along with a host of other challenges that significantly increase the demand on back-end infrastructure resources. Yet customers often expect zippier responses from the simpler interfaces. Failure is more glaring because it's the only item on the screen.
Fifty-seven percent of IT ops managers also felt that cloud-based applications require more performance engineering, if only because the interaction with the cloud adds even more complexity to back-end infrastructure.
In the past, most people only interacted with cloud services occasionally, perhaps to load one piece of data or store a response. Modern cloud applications poll the cloud frequently, and often the entire system is hosted in the cloud, or a task is handled as a third-party service call to the cloud. Thus, failures are less apparent, increasing the need for performance engineering.
A broader understanding of performance engineering
The survey responses revealed a consistent theme: application development and IT operations managers, along with business leaders and end users, are recognizing that they must include performance engineering and its key practices in the broader organization and culture throughout the product lifecycle. The survey identifies many reasons for this, including the complexity of composite applications and increased end-user expectations for responsiveness—anywhere, at any time, on any device, for any length of time.
Team members most strongly associated with performance engineering might be thought of in a variety of capacities, all vital to business. They might be financial engineers who are ensuring that revenues will keep the enterprise afloat. They could be employee-empowerment engineers who ensure that everyone has the tools to keep the enterprise running. Or, they could be customer-satisfaction engineers who maintain user happiness.
That's a big collection of hats and spinning plates, and that's why a greater understanding of performance engineering as a cross-discipline, intra-business mindset is becoming so essential.