eCommerce sites are now a primary touchpoint between businesses and their customers. For retail and B2B alike, sites must be highly available, and pages must load quickly to keep a loyal following. A reliable site offers a consistent and satisfying shopping experience: buyers can browse, search, and buy products whenever they choose. A site that delivers its content quickly also leads to higher conversion rates and sales. How can you ensure the best possible buying experience for your shoppers? In large part, by taking a proactive, disciplined approach to performance monitoring and using key metrics to plan and make better decisions.
Examples of Performance Monitoring
An eCommerce environment consists of many complex systems, most of which depend on others. A capacity issue with the database, for example, can show up as higher-than-usual latencies at the front-end load balancer because the application is waiting on results from the database. To the end user, the site is simply slow to respond. If the issue persists, frustrated visitors will likely take their time and money elsewhere. A constant focus on key performance metrics helps you identify and resolve potential problems before they affect visitors, and lets you react quickly when the situation demands it.
Here are just a few examples of the performance monitoring metrics that Tenzing watches continually. Even this short list of data points tells us when certain aspects of the system need a closer look, and soon.
- Request counts – an unusually high and sustained rate of requests from a small pool of machines can indicate a distributed denial-of-service (DDoS) attack. Once the sources are identified, measures can be taken to stop this flow of illegitimate activity.
- Latency at the load balancer – measured in seconds, this is the time elapsed from when a request leaves the load balancer until the HTTP(S) header of the response is received. This metric is a good proxy for how quickly your application is responding to your site visitors.
- Storage queue length (application and database) – queue length is the number of I/O requests that are pending. A sustained increase in queue length may suggest that the throughput of the storage medium is at or near saturation.
- Database read and write latencies – measured in milliseconds, this metric records the average time for a read or write operation. Ideally, each operation should be completed in fewer than 20 milliseconds (ms).
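As a rough illustration of how metrics like these can be turned into automated checks, here is a minimal Python sketch. It is not Tenzing's tooling: the function names, the 25% per-source share, and the three-sample window are all assumptions made for the example. Only the 20 ms database guideline comes from the list above.

```python
from collections import Counter
from statistics import mean

# Assumed thresholds, for illustration only; tune them to your environment.
DB_LATENCY_LIMIT_MS = 20.0    # the 20 ms guideline from the list above
MAX_SHARE_PER_SOURCE = 0.25   # hypothetical: one source sending >25% of requests


def slow_db_operations(latencies_ms, limit_ms=DB_LATENCY_LIMIT_MS):
    """Return True when the average read/write latency reaches the limit."""
    return mean(latencies_ms) >= limit_ms


def suspicious_sources(source_ips, max_share=MAX_SHARE_PER_SOURCE):
    """Return the source addresses that sent more than max_share of all requests."""
    counts = Counter(source_ips)
    total = len(source_ips)
    return sorted(ip for ip, n in counts.items() if n / total > max_share)


def queue_is_growing(queue_lengths, window=3):
    """Return True when each of the last `window` samples exceeds the one before it."""
    recent = queue_lengths[-(window + 1):]
    if len(recent) < window + 1:
        return False  # not enough samples to call it a sustained increase
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))
```

In practice, checks like these would run against samples exported by your monitoring system, and a real alert would usually require the condition to hold over several intervals before notifying anyone.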
Looking Back to Plan Ahead
Performance monitoring and the key metrics that come with it give Tenzing a historical record of how each component in an environment has responded to load over time. For sites that have recurring sales periods or other promotional peaks, analyzing past data forms a strong foundation for capacity planning for similar events in the future. Predictive analytics are vital, and the more historical data we have at our disposal, the more effective we can be at helping customers plan ahead.
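To make the idea of planning from past peaks concrete, the following Python sketch estimates server count for an upcoming event. Everything in it is an assumption for illustration: the function name, the growth factor, the per-server throughput, and the 30% headroom are hypothetical inputs, not figures from Tenzing.

```python
import math


def servers_needed(past_peak_rps, expected_growth, per_server_rps, headroom=0.3):
    """Estimate how many servers an upcoming peak needs.

    past_peak_rps   -- peak requests per second seen in earlier, similar events
    expected_growth -- expected traffic growth as a fraction (0.25 means +25%)
    per_server_rps  -- requests per second one server has handled comfortably
    headroom        -- fraction of each server's capacity kept in reserve
    """
    projected_peak = max(past_peak_rps) * (1 + expected_growth)
    usable_per_server = per_server_rps * (1 - headroom)
    return math.ceil(projected_peak / usable_per_server)


# Example with made-up numbers: three past sales peaked at 800, 1200, and 950
# requests per second, and 25% more traffic is expected this time.
print(servers_needed([800, 1200, 950], expected_growth=0.25, per_server_rps=200))
```

A real capacity plan would look at every tier (load balancer, application, database, storage) rather than a single request rate, but the same pattern of projecting from recorded peaks and reserving headroom applies.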
Proactive vs. Reactive
Preparing a site for an expected traffic surge vastly improves the likelihood of reliability and performance. Combining this preparation with historical data further improves your ability to plan for and adapt to traffic trends. Reacting to resource constraints in real time, on the other hand, is disruptive and time-consuming, and the panic that usually ensues often leads to mistakes. Keeping a watchful eye on the past and present helps you get ahead of the curve, so changes can be made before capacity thresholds are exceeded and become problematic. Being reactive increases the odds of system downtime, lost customers, and potential damage to your brand’s reputation.
Strong performance monitoring gives eCommerce business managers and their IT teams the visibility they need into site performance and some aspects of security. However, it’s not a panacea for other fundamental issues in an environment. Two conditions we often see are:
- Under-provisioned infrastructure designed to meet unrealistic budgetary targets – making the most of what you have is an understandable and prudent goal. However, when it comes to running a production-class eCommerce system, cutting corners to save a few dollars typically leads to disaster.
- Poorly written or tuned application code – a typical reaction to a slow-performing site is to add more infrastructure. The prevailing attitude is that the application simply needs an increase in horsepower to get the job done. While improvements can be achieved this way, the approach may not address the root cause of the problem. Inefficient code demands more resources than a properly tuned application.
The Bottom Line
From a technical perspective, many critical processes must work together to ensure eCommerce success. However, few things are more important than strong, ongoing performance monitoring and the ability to adapt proactively using the metrics it produces.