Even after sponsoring, exhibiting, and walking the floor at Velocity every year since the first conference in 2008, we learn something new at each conference.
1. Sorry, we still can’t capture the metric that matters most from real users.
Users expect vastly better performance than most sites deliver, and the testing methodologies employed in the last decade have painted an artificially rosy picture. (Sorry, your site doesn’t load in two seconds for your users – not unless they routinely browse the web from 16-core servers plugged into top-of-rack switches at major Internet exchanges.)
RUM, or Real User Measurement, is much more realistic than Synthetic testing from nodes sitting on the Internet backbone. That’s not news – but RUM can only report what the browsers expose via APIs such as Navigation Timing and Resource Timing.
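To make that limitation concrete, here is a minimal sketch of what a RUM script can derive from a Navigation Timing entry. The `NavTimingLike` interface, the `summarize` helper, and the sample numbers are all hypothetical; only the field names are taken from the Navigation Timing (Level 2) attributes. Note that every value is a network or DOM milestone, and none of them says anything about what the user actually sees.

```typescript
// Subset of Navigation Timing (Level 2) attributes, in ms since navigation start.
// The interface name is hypothetical; the field names follow the spec.
interface NavTimingLike {
  startTime: number;
  redirectStart: number;
  redirectEnd: number;
  responseStart: number;
  domContentLoadedEventEnd: number;
  loadEventEnd: number;
}

interface RumMilestones {
  redirectMs: number;          // time spent following redirects
  ttfbMs: number;              // time to first byte of the response
  domContentLoadedMs: number;  // DOMContentLoaded finished
  loadMs: number;              // onload finished
}

// Hypothetical helper: turn one timing entry into the milestones RUM can report.
function summarize(t: NavTimingLike): RumMilestones {
  return {
    redirectMs: t.redirectEnd - t.redirectStart,
    ttfbMs: t.responseStart - t.startTime,
    domContentLoadedMs: t.domContentLoadedEventEnd - t.startTime,
    loadMs: t.loadEventEnd - t.startTime,
  };
}

// In a browser the entry would come from something like:
//   const [nav] = performance.getEntriesByType("navigation");
// Here we use made-up numbers purely for illustration.
const sampleEntry: NavTimingLike = {
  startTime: 0,
  redirectStart: 0,
  redirectEnd: 300,
  responseStart: 420,
  domContentLoadedEventEnd: 1800,
  loadEventEnd: 3500,
};

const milestones = summarize(sampleEntry);
console.log(milestones);
```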
The entire Web Performance Optimization industry (WPO, FEO, FEA) agrees that DOM events, network events, and pretty much every “milestone” benchmark – a moment in time measured in seconds elapsed since the page began loading – are poor proxies for the end-user’s perception of performance. Even visual milestones like “Start Render” (when the visible portion of the page starts to render) and “Viewport Complete” (when the visible portion of the page finishes rendering) don’t fully capture the “feeling” of performance. A page that loads 80% of the content within 1 second, then the remaining 20% within the next 5 seconds, will feel faster than one that loads the other way – 20% within 1 second, and 80% in the next 5 seconds – yet both would have the same Visual Milestone timings.
Luckily for the WPO industry, Patrick Meenan of WebPagetest fame and his team came up with a Visual Progress metric – Speed Index – which addresses this performance inequity and finally gives credit to websites that focus on loading as much “Above The Fold” content as early as possible – even if that means delaying certain DOM and network events, like the onload event or Fully Loaded Time (Time to Last Byte, or Waterfall Complete). After all, users don’t care when their NIC card stops blinking – just what they can see on their screen.
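The 80/20 example can be worked through with a small sketch. Speed Index is the area above the visual progress curve: the sum, over each interval, of the fraction of the viewport still missing multiplied by the interval length in milliseconds (lower is better). The `ProgressSample` type and `speedIndex` function below are hypothetical names, and the sketch assumes content appears in discrete steps at the sampled times rather than from real video frames as WebPagetest does.

```typescript
// One visual-progress sample (hypothetical type for this sketch).
interface ProgressSample {
  timeMs: number;           // milliseconds since the page began loading
  percentComplete: number;  // share of the final viewport rendered, 0..100
}

// Sum of (share still missing) * (interval length), treating visual
// completeness as constant between samples (a step function).
function speedIndex(samples: ProgressSample[]): number {
  const sorted = samples.slice().sort((a, b) => a.timeMs - b.timeMs);
  let total = 0;
  let prevTime = 0;
  let prevPercent = 0; // nothing is painted at t = 0
  for (const s of sorted) {
    total += ((100 - prevPercent) * (s.timeMs - prevTime)) / 100;
    prevTime = s.timeMs;
    prevPercent = s.percentComplete;
  }
  return total;
}

// Page A: 80% at 1 second, the remaining 20% at 6 seconds.
const frontLoaded = speedIndex([
  { timeMs: 1000, percentComplete: 80 },
  { timeMs: 6000, percentComplete: 100 },
]);

// Page B: 20% at 1 second, the remaining 80% at 6 seconds.
const backLoaded = speedIndex([
  { timeMs: 1000, percentComplete: 20 },
  { timeMs: 6000, percentComplete: 100 },
]);

console.log(frontLoaded, backLoaded); // 2000 5000
```

Both pages start rendering at 1 second and finish at 6 seconds, so their visual milestones are identical, yet under these assumptions the front-loaded page scores 2000 against 5000 for the back-loaded one.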
There’s only one problem – WebPagetest is the only performance tool to compute Speed Index. Major vendors in the space, like Keynote and Compuware, haven’t adopted this type of metric yet, and browsers don’t expose visual progress benchmarks via API. So we have a gap in our ability to benchmark: limited support for Speed Index in synthetic tools, and nothing visual from RUM. This leaves us with the following options, from most to least desirable:
- RUM of Network/DOM Events (growing rapidly in popularity)
- Synthetic Visual Progress (currently only available in WebPagetest as Speed Index)
- Synthetic monitoring of Visual Milestones (e.g. Start Render or Visually Complete)
- Synthetic monitoring of Network/DOM Events (still the most common, unfortunately)
However, capturing Visual Milestones and Visual Progress with RUM would be even better than any of these four options. Unfortunately, it’s not possible today. What if browsers could compute Speed Index, or something very similar, in real time and with low overhead, and expose the information via a Timing API?
And want to know something? It’s not unheard of. In the online video space (where we spend a lot of time, obviously) companies have been analyzing performance in real time for years by integrating technologies into the video players. Maybe the browser folks can take a cue from the player folks.
2. There is no room for redirects on mobile.
If we want to break Ilya Grigorik’s 1,000ms “time to glass” barrier, mobile redirects need to be redirected… into oblivion. Eliminating redirects on your mobile site is the only way to keep response time under 100 ms and viewport complete under 1,000 ms. Put best by Ilya himself, “One redirect on most mobile devices and you’ve already blown your budget.” Yet on many sites, your device has to follow multiple redirects just to load the mobile homepage.
I’m sure you’ve seen something like this:
Keeping your web server configuration file as short as possible may have its merits, but in the case of redirects, it’s not worth the price your users have to pay on every visit.
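One way to avoid chained redirects is to resolve the whole chain on the server and answer with a single hop. The sketch below is only an illustration under stated assumptions: the `example.com` URLs, the `redirectRules` table, and the `resolveFinal` helper are all hypothetical and do not correspond to any particular web server’s configuration syntax.

```typescript
// Hypothetical redirect rules: each key redirects to its value.
const redirectRules: Record<string, string> = {
  "http://example.com/": "http://www.example.com/",
  "http://www.example.com/": "http://m.example.com/",
  "http://m.example.com/": "http://m.example.com/home",
};

interface Resolved {
  target: string; // where the chain finally ends
  hops: number;   // how many redirects a client would otherwise follow
}

// Follow the rule table until no rule matches, so the server can send the
// client straight to the final URL. maxHops guards against redirect loops.
function resolveFinal(
  url: string,
  table: Record<string, string>,
  maxHops: number = 10
): Resolved {
  let current = url;
  let hops = 0;
  while (Object.prototype.hasOwnProperty.call(table, current)) {
    if (hops >= maxHops) {
      throw new Error(`redirect chain too long or looping at ${current}`);
    }
    current = table[current] as string;
    hops += 1;
  }
  return { target: current, hops };
}

const resolved = resolveFinal("http://example.com/", redirectRules);
console.log(resolved.target, resolved.hops);
```

With the rules above, a client that would have followed three redirects can instead be sent directly to `http://m.example.com/home` in one.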
Here are some resources you might find useful to help you tackle the redirection nightmare:
3. Image optimization is tough.
Images are getting heavier, making up the majority of bytes delivered, with averages approaching 1 MB per page, drastically increasing load times. And guess what? Optimizing them is complex.
Unfortunately, there is no one “best” approach. It all depends on the client and the method of consumption. For example, if a common use-case is saving images to disk and sharing them, WebP is probably not ideal, as this format doesn’t have an associated viewer on most clients. Paying close attention to the size of images and the devices that need to display them is crucial, as it can make a dramatic difference in load time and responsiveness.
Since there is no universal “best” format or optimization technique, adaptive behavior and automated application of optimizations are the only solution.
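As a small illustration of adaptive behavior, a server can choose the image format per request. The sketch below assumes the client advertises WebP support through the HTTP `Accept` header (`image/webp`) and that JPEG is the safe fallback; the `pickImageFormat` function and its `forDownload` flag are hypothetical names invented for this example.

```typescript
type ImageFormat = "webp" | "jpeg";

// Pick a format for one request.
// - forDownload: the image is meant to be saved and shared, so prefer a
//   format that most desktop viewers can open.
// - acceptHeader: the raw HTTP Accept header sent by the client, if any.
function pickImageFormat(
  acceptHeader: string | undefined,
  forDownload: boolean
): ImageFormat {
  if (forDownload) {
    return "jpeg";
  }
  const accepted = (acceptHeader || "")
    .split(",")
    .map((part) => part.replace(/;.*$/, "").trim().toLowerCase()); // drop q-values
  return accepted.some((type) => type === "image/webp") ? "webp" : "jpeg";
}

console.log(pickImageFormat("image/webp,image/*,*/*;q=0.8", false)); // webp
console.log(pickImageFormat("image/png,image/*;q=0.8", false));      // jpeg
console.log(pickImageFormat("image/webp,*/*;q=0.8", true));          // jpeg
```

A real deployment would also vary on device size and connection quality; this only shows the format decision.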
Here’s a deep dive into the world of image optimization: http://www.datacenterknowledge.com/archives/2013/06/19/photos/
4. Don’t fall for the first certificate authority you meet.
Your choice of CA (SSL Certificate issuing Authority) is very important, and the temptation is to minimize selection criteria. But OCSP and CRL performance vary wildly by country, and there’s a huge variance even on a globally averaged basis.
The advice? Don’t choose on price and reputation alone. x509 Labs has published data about OCSP and CRL performance. Look at the countries that matter the most to you, and ensure that your CA performs well there.
5. Yesterday’s “best practice” could be today’s “anti-pattern.”
Browsers and devices are constantly evolving, working around bottlenecks (and introducing new ones). For example, Domain Sharding first emerged to circumvent older browsers’ “2 connections per server” limit and increase parallelism. Modern browsers have raised that limit to 6+. Unconditionally implementing this “best practice” can now run into client CPU and home router limitations. In most cases, 2 connections still yield a benefit and 4 is the point of diminishing returns.
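If you do shard, it helps to keep the shard count a tunable setting rather than a hard-coded constant, so it can be lowered or switched off as browsers change. The following is a rough sketch only: the `shardHost` function, the `staticN.` hostname pattern, and `example.com` are all hypothetical. It hashes the asset path so the same asset always maps to the same hostname, which keeps browser caches effective.

```typescript
// Map an asset path to a hostname. shardCount <= 1 disables sharding.
function shardHost(
  assetPath: string,
  shardCount: number,
  baseDomain: string
): string {
  if (shardCount <= 1) {
    return baseDomain;
  }
  // Simple deterministic string hash, kept in unsigned 32-bit range.
  let hash = 0;
  for (let i = 0; i < assetPath.length; i++) {
    hash = (hash * 31 + assetPath.charCodeAt(i)) >>> 0;
  }
  return `static${hash % shardCount}.${baseDomain}`;
}

// Same path, same host on every call; shard count is a configuration choice.
const hostA = shardHost("/img/logo.png", 2, "example.com");
const hostB = shardHost("/img/logo.png", 2, "example.com");
const unsharded = shardHost("/img/logo.png", 1, "example.com");
console.log(hostA, hostB, unsharded);
```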
Want to learn more? Check out this presentation from O’Reilly (especially slide #12): http://cdn.oreillystatic.com/en/assets/1/event/94/Top 10 WPO Disasters_ Don’t Let This Happen To You Presentation.pdf
Performance optimization is a complex, hairy beast that evolves too quickly for most people to keep up with. It’s no surprise that the world’s leading web performance experts have shifted from advocating “hand-tuning” to implementing automated web performance optimization technologies.
Shameless plug time: Did I mention that Limelight Orchestrate Performance offers full symmetric dynamic content acceleration with automated web performance optimization?