In a previous post, we examined the importance of monitoring and improving site performance from an end-user perspective, and what can happen when the user experience is neglected. Now, let’s switch into action mode to answer the when, how, and what of performance monitoring.
When to Monitor: Analyzing the Risks
In an ideal world, software developers would spend their limited time designing and implementing new features and functionality. But in the messy real world of application development, much of their time is taken up resolving defects. And the later these defects appear in the development cycle, the costlier they become. According to IBM’s Systems Sciences Institute, an error found after product release costs four to five times as much to fix as one uncovered during design, and one identified in the maintenance phase can cost up to 100 times more.
Moreover, we find that a majority of defects creep in during the design and coding phases of the software development lifecycle: 25% and 35%, respectively. In the past, developers had to search for these errors manually, spending hours forming assumptions and then predicting and locating the source of a bug. However, as applications expand to include third-party functionality, distributed content, and microservices, the task of preventing and remedying defects becomes more and more complex. And in the race to build speedy experiences faster than everyone else, every second counts.
Table: Division of Defects Introduced into Software by Phase (columns: Software Development Phase, Percent of Defects Introduced). Source: Computer Finance Magazine
Also, recall that as you begin to focus on the user experience, you must move away from the simple, binary functionality question of “Does the feature work?” toward more complex questions that cover both functionality and performance. From both a user experience and a cost-benefit perspective, you need a more proactive, preventative approach that extends the initial question to: “How well does the feature work, and does it affect the performance of my site?” As with health, prevention in software is the best cure.
Move Monitoring Left
Given both the cost of a defect once it reaches the maintenance phase and the percentage of defects introduced during coding, it makes the most sense to move performance monitoring into the earlier development and implementation phases of your lifecycle. To detect and correct performance problems before they impact users, you need to shift your battle plan to include monitoring as a part of the software development process, rather than just the software maintenance process.
How to Monitor: Shifting into Development
Moving monitoring into non-production environments means that you’ll need to select a synthetic (active) monitoring solution that simulates user actions and traffic to identify problems and evaluate slowdowns before the problem affects users and customers. Synthetic monitoring is best used to simulate typical user behavior or navigation to monitor the most commonly used paths and business critical processes of your site.
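To make the idea concrete, here is a minimal sketch of a synthetic check in Python. It is not how any particular monitoring product works: the function names, the `StepResult` type, and the rule of stopping at the first failed step are all assumptions made for illustration. It walks a scripted user path, times each step, and records whether the step succeeded.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Tuple
from urllib.request import urlopen


@dataclass
class StepResult:
    name: str       # label for the step, e.g. "home"
    ok: bool        # True if the step returned HTTP 200
    seconds: float  # wall-clock time spent on the step


def http_fetch(url: str) -> int:
    """Fetch a URL and return its HTTP status code."""
    with urlopen(url, timeout=10) as response:
        return response.status


def run_synthetic_check(
    steps: List[Tuple[str, str]],
    fetch: Callable[[str], int] = http_fetch,
) -> List[StepResult]:
    """Walk a scripted user path in order, timing each step.

    Stops at the first failed step, since later steps in a user
    journey usually depend on the earlier ones succeeding.
    """
    results: List[StepResult] = []
    for name, url in steps:
        start = time.perf_counter()
        try:
            ok = fetch(url) == 200
        except OSError:
            # urllib raises OSError subclasses for HTTP errors,
            # connection failures, and timeouts.
            ok = False
        results.append(StepResult(name, ok, time.perf_counter() - start))
        if not ok:
            break
    return results
```

You would call `run_synthetic_check` with a list of `(name, url)` pairs that describe a business-critical path in your pre-production environment, such as home page, product page, and cart. The `fetch` parameter is injectable so the same script can be pointed at a different client, or at a stub when testing the check itself.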
We recommend using the three best practices below as a starting point to make the shift into monitoring performance during your development phase:
- Monitor Pre-production Environments: To get a better handle on defects earlier in the process, you need to monitor performance in your pre-production environments. Problems can also arise when development and operations try to reproduce issues in different environments, so you’ll want to use the same monitoring solution in pre-production that you use in production.
Additionally, ensure that your teams are using the same set of KPIs to evaluate both environments. That said, try not to compare your development and production environments as apples to apples. Use your monitoring solution to establish baselines for both environments and understand how these baselines relate to one another. Know that a regression in development will lead to a regression in production, even though the two regressions are unlikely to be identical.
- Track Code Changes within the Solution: Help identify the causes of regression by tracking code changes in the tool itself. When choosing a monitoring tool, make sure the system can annotate development changes and track performance changes before and after deploys.
- Evaluate Performance Regression Often: Look for performance regressions with every meaningful engineering event. Define the severity of defects you want to track to cut down on alerting noise and hide low-severity or known issues. Set alerts to notify you if performance degrades or improves after a deploy.
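As a rough illustration of the last two practices, the Python sketch below annotates a deploy, splits timing samples into before and after, and classifies the change by severity. Everything in it is an assumption for the sake of the example: the `Deploy` record, the use of the median as a baseline, and the 5% and 20% thresholds are placeholders, not values from any monitoring tool.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Sequence, Tuple


@dataclass
class Deploy:
    label: str        # e.g. a commit hash or release tag
    timestamp: float  # when the deploy happened (epoch seconds)


def split_by_deploy(
    samples: Sequence[Tuple[float, float]], deploy: Deploy
) -> Tuple[List[float], List[float]]:
    """Split (timestamp, load_seconds) samples into before/after the deploy."""
    before = [value for ts, value in samples if ts < deploy.timestamp]
    after = [value for ts, value in samples if ts >= deploy.timestamp]
    return before, after


def relative_change(before: Sequence[float], after: Sequence[float]) -> float:
    """Relative change of the median timing; positive means slower.

    Assumes both lists are non-empty and timings are positive.
    """
    old = median(before)
    return (median(after) - old) / old


def classify(change: float, minor: float = 0.05, major: float = 0.20) -> str:
    """Map a relative change to a severity label (thresholds are illustrative)."""
    if change <= -minor:
        return "improved"
    if change < minor:
        return "unchanged"
    if change < major:
        return "minor regression"
    return "major regression"


def should_alert(
    verdict: str, alert_on: Sequence[str] = ("improved", "major regression")
) -> bool:
    """Alert only on the verdicts you chose to track, to cut down on noise."""
    return verdict in alert_on
```

Running the same functions against each environment gives you a separate baseline per environment, so you can compare how a deploy moved each baseline rather than comparing raw numbers across environments. Tightening or loosening `alert_on` is one simple way to express which severities are worth a notification.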
What to Monitor: Developing Your KPIs
Shifting your perspective on when and how to monitor also affects what you want to monitor. Historically, web performance monitoring has focused on availability, uptime, and initial response times. Those are all useful stats, but moving into a more proactive position requires that you reevaluate how you assess performance. While you should use the same KPIs in your pre-production environment that you use in production (such as those listed above), make sure you ask about and understand every KPI your teams care about.
For example, as an eCommerce provider, you may want to track time to interactive because this is when people can add items to a cart (read: sales and revenue). However, your Ops team may only be concerned with general availability and page load timings. Start to grow your monitoring strategy beyond general performance KPIs, such as render time, load time, page size, and number of resources, and make sure you’re also establishing KPIs that align with your business needs.
Rigor Monitoring tracks and analyzes KPIs vital to your business.
Begin by evaluating your business needs: Are you an eCommerce provider who cares about how site performance affects shopping cart abandonment and average order values? Or a content provider who’s concerned about sluggish ads negatively affecting your user experience?
Sample industry-specific KPIs include:
- eCommerce Providers:
  - Time to first item added to shopping cart
  - Shopping cart abandonment
  - Conversion rates of product pages
- Media/Content Providers:
  - User experience KPIs such as SpeedIndex
  - Start render
  - Number of pageviews per session
  - Business metrics such as time to first ad
  - Average number of ad impressions per session
- Enterprise and SaaS Providers:
  - Time to First Byte
  - Server/backend time
  - Conversion rates
  - Time for critical application flows
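One way to make KPI lists like these actionable is to write them down as targets and check measurements against them. The Python sketch below does that under loud assumptions: the profile names, KPI keys, and every threshold are made-up placeholders, and it only handles "lower is better" timing KPIs, so rate-style metrics such as conversion would need their own comparison.

```python
from typing import Dict, List

# Hypothetical KPI targets per type of business. The keys echo the
# sample KPIs above; every number is a placeholder, not a recommendation.
KPI_TARGETS: Dict[str, Dict[str, float]] = {
    "ecommerce": {
        "time_to_interactive_s": 5.0,
        "time_to_first_cart_add_s": 60.0,
    },
    "media": {
        "speed_index": 3000.0,
        "time_to_first_ad_s": 2.5,
    },
    "saas": {
        "time_to_first_byte_s": 0.5,
        "backend_time_s": 0.8,
    },
}


def kpis_over_target(profile: str, measured: Dict[str, float]) -> List[str]:
    """Return the names of KPIs whose measured value exceeds its target.

    KPIs without a measurement are skipped; results are sorted by name.
    """
    targets = KPI_TARGETS[profile]
    return sorted(
        name
        for name, limit in targets.items()
        if name in measured and measured[name] > limit
    )
```

For example, `kpis_over_target("saas", {"time_to_first_byte_s": 0.7, "backend_time_s": 0.4})` flags only `time_to_first_byte_s`. Keeping the targets in one shared structure is a simple way for development and operations to agree on which KPIs matter and what "good" looks like for each.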
To learn more about monitoring and how it affects continuous performance, check out our eBook on Building Faster Experiences with Continuous Performance.