Web of bloat
On the subject of ever-increasing average page weight, I decided to test a few popular sites. On top of the usual fixes (minify, gzip, crush and lazy-load media, etc.) I came up with a few more that are maybe worth considering. And do check out an awesome talk by Maciej Cegłowski, The Website Obesity Crisis.
Client requests and bandwidth w/ caching
The bloat problem can be effectively cut down with good markup and cache control. Ideally the page markup would initiate the fetching of most resources (apart from eg. lazy-loaded images/media), but in reality the initiators are spread all over and laden with cache-busting random strings. A 3rd-party javascript initiating the download of another 3rd-party javascript leaves the original site with zero control over the content. And as the number of ad-tech partners hits double digits, it's a security (and surveillance) nightmare for the clients.
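The difference is easy to see in markup. A hypothetical example (file names and host are made up): a content-hashed asset URL stays cacheable indefinitely because it only changes when the file does, while a cache-busted 3rd-party loader gets fetched fresh on every view:

```html
<!-- cacheable first-party asset: the hashed name changes only when the file does -->
<link rel="stylesheet" href="/assets/site.3fa9c1.css">
<!-- uncacheable and out of your control: 3rd-party loader with a random cache-buster -->
<script src="https://cdn.adtech.example/loader.js?r=873921"></script>
```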
I tested the sites with a clean cache, without images and media, and with 3rd-party stuff blocked using Ghostery.
Site | 1st view | 2nd view | 2nd view w/o media | w/o trackers (# blocked) |
---|---|---|---|---|
New York Times | 300 req / 2.6 MB | 470 KB | 385 KB | 173 KB (38) |
Gizmodo | 700 req / 2.4 MB (+40 MB video) | 790 KB | 690 KB | 83 KB (30) |
European Parliament | 160 req / 2.9 MB | 800 KB | 97 KB | 40 KB (5) |
GitHub | 20 req / 722 KB | 10 KB | 10 KB | 10 KB (1) |
Hacker News | 6 req / 10.9 KB | 5.7 KB | 5.7 KB | 5.7 KB (0) |
This blog | 8 req / 14 KB | 4.5 KB | 4.5 KB | 4.5 KB (0) |
Ad-driven sites like the New York Times and Gizmodo are quite traffic- and resource-hungry, but even if you're not depending on ad revenue, like the EU Parliament, you can still effectively hose your citizens with 3 megabytes of nothing (the 2nd view is much better though). GitHub sets a good example of smart caching, as the second load is only 2% of the initial one. Hacker News' no-nonsense headline list is blazing fast, and this site is included just as a no-content reference. For mobile users, their data package might just make them steer away; see What Does My Site Cost for the average price of your site in other countries.
CPU cycles, instructions and task-clock
I tested some of the same sites while recording CPU cycles, instructions and task-clock with `perf`. Chrome disowns the browser window, so I had to use a tiny script to dig up the new process, which in turn would be slaughtered after 10 seconds. Chrome's disk and media caches are placed on tmpfs. `perf` is given a jiffy after the `kill` signal to store its output, which is then inserted into a database.
```bash
#!/bin/bash
# $1 = URL, $2 = bust cache (1/0), $3 = Ghostery enabled (1/0)
[[ $2 == 1 ]] && rm -rf /tmp/tmpffffs/Default   # wipe profile data for a cold-cache run
# launch Chrome under perf with its caches on tmpfs; counters go to perfout
perf stat -o perfout -x$' ' -e cycles,instructions,task-clock google-chrome --media-cache-dir=/mnt/tmpffffs --disk-cache-dir=/mnt/tmpffffs "$1" &
ppid=$!
# Chrome disowns itself, so poll for the child process it spawns
while [[ -z $pid ]] ; do pid=$(ps --ppid "$ppid" -o pid:1 --no-headers); done
sleep 10    # let the page load and settle
kill "$pid"
sleep 0.5   # give perf a jiffy to write its report
# create a row for this run, then turn each counter line into an UPDATE
id=$(mysql -B stats -e "INSERT perf (site, is_clean_cache, is_ghostery) VALUES ('$1','$2','$3');SELECT LAST_INSERT_ID();" | grep -Eo "^[0-9]+$")
awk -v id="$id" '$1 ~ /^[0-9]+/ {gsub("-","_", $2) ; print "UPDATE perf SET " $2 "=" $1 " WHERE id=" id ";"}' perfout | mysql stats
```
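For reference, a run of the script (saved here as `perftest.sh`, a name of my own choosing) with a cold cache and Ghostery enabled would look like this:

```bash
# args: URL, bust cache (1/0), Ghostery enabled (1/0)
./perftest.sh "https://gizmodo.com" 1 1
```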
The `perf stat` report will also include browser startup, so I needed a baseline for the tests. I first thought `about:blank` would be the ideal "nothing" to render, but interestingly opening `about:blank` consumed roughly 8% more resources than a short html file vOv. So the minimum is set by `file:///mnt/tmpffffs/derp.html` with the following content:
```html
<!DOCTYPE html><html><head><title>derp</title></head><body><p>derp</p></body></html>
```
Two other variables in the test runs are `c` for busting the cache and `g` for Ghostery being enabled. Columns prefixed with Δ are compared against the local-file baseline: eg. Gizmodo with a warm cache and Ghostery averages 20.55 Gins, of which 12.69 Gins were caused by the site itself (20.55 Gins minus the 7.86 Gins baseline).
Site | c | g | G_cycles | G_ins | s_tclock | ΔG_cycles | ΔG_ins | Δs_tclock |
---|---|---|---|---|---|---|---|---|
/mnt/tmpffffs/derp.html | 0 | 0 | 4.59 | 6.39 | 1.16 | 0.00 | 0.00 | 0.00 |
/mnt/tmpffffs/derp.html | 1 | 1 | 5.91 | 7.99 | 1.56 | 0.00 | 0.00 | 0.00 |
/mnt/tmpffffs/derp.html | 0 | 1 | 5.70 | 7.86 | 1.51 | 0.00 | 0.00 | 0.00 |
/mnt/tmpffffs/derp.html | 1 | 0 | 4.45 | 6.19 | 1.17 | 0.00 | 0.00 | 0.00 |
This blog | 0 | 0 | 5.16 | 6.91 | 1.32 | 0.57 | 0.52 | 0.16 |
This blog | 1 | 1 | 6.82 | 8.94 | 1.93 | 0.91 | 0.95 | 0.37 |
This blog | 1 | 0 | 5.36 | 7.18 | 1.56 | 0.91 | 0.99 | 0.39 |
This blog | 0 | 1 | 6.66 | 8.63 | 1.75 | 0.96 | 0.77 | 0.24 |
The European Parliament | 0 | 0 | 8.83 | 11.11 | 2.83 | 4.24 | 4.72 | 1.67 |
The European Parliament | 1 | 0 | 9.28 | 11.58 | 4.01 | 4.83 | 5.39 | 2.84 |
The European Parliament | 0 | 1 | 11.50 | 13.68 | 4.92 | 5.80 | 5.82 | 3.41 |
The European Parliament | 1 | 1 | 11.86 | 14.08 | 5.80 | 5.95 | 6.09 | 4.24 |
The New York Times | 0 | 1 | 12.65 | 16.68 | 4.08 | 6.95 | 8.82 | 2.57 |
The New York Times | 1 | 1 | 13.04 | 17.04 | 4.49 | 7.13 | 9.05 | 2.93 |
Gizmodo | 0 | 1 | 17.14 | 20.55 | 10.03 | 11.44 | 12.69 | 8.52 |
The New York Times | 0 | 0 | 16.18 | 20.59 | 5.73 | 11.59 | 14.20 | 4.57 |
Gizmodo | 1 | 1 | 17.70 | 21.00 | 10.06 | 11.79 | 13.01 | 8.50 |
The New York Times | 1 | 0 | 17.89 | 22.40 | 6.77 | 13.44 | 16.21 | 5.60 |
Gizmodo | 0 | 0 | 20.70 | 24.25 | 10.59 | 16.11 | 17.86 | 9.43 |
Gizmodo | 1 | 0 | 21.40 | 24.84 | 10.88 | 16.95 | 18.65 | 9.71 |
- c = cache cleared before startup
- g = Ghostery enabled
- G_cycles = giga CPU cycles
- G_ins = giga instructions
- s_tclock = task-clock in seconds
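The averages above come straight out of the `perf` table the script fills; a query along these lines would produce them (the exact aggregation is my assumption, and `perf` reports task-clock in milliseconds, hence the divisor):

```sql
-- assumed aggregation: average each counter per (site, cache, Ghostery) group
SELECT site, is_clean_cache AS c, is_ghostery AS g,
       AVG(cycles) / 1e9       AS G_cycles,
       AVG(instructions) / 1e9 AS G_ins,
       AVG(task_clock) / 1e3   AS s_tclock
FROM perf
GROUP BY site, is_clean_cache, is_ghostery;
```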
Tested with google-chrome 54.0.2840.59, LightDM, a 4.4.0-43 kernel, Nvidia 361, an i7-6700K and enough RAM to fs-cache everything.
Some findings
- Gizmodo fully utilizes a single CPU core in each test. Does the task-clock surpass the run time on a slower CPU?
- The European Parliament gets into a fight with Ghostery, adding almost 10% CPU time.
- Blocking 3rd-party resources on Gizmodo and NYTimes saves ~15% CPU time on top of the 50-70% savings in bandwidth and requests.
- Ghostery overhead is ~1.5 Gins.
- Bloated sites consume the internet, your CPU, your cache, your time (and battery, if applicable).
- Page weight and CPU time go pretty much hand in hand.
So, what could (and should) be done?
Fix 1 : Markup on a diet
To illustrate the problem I took a sample from a list of `<article>`s on Gizmodo. This one has 288 visible characters, 1 picture, 2 links to the main article and a social-media share widget:
```html
<article class="postlist__item hentry js_post_item status-published post-item-frontpage" data-id="1788247483" id="post_1788247483" data-model="%7B%22id%22%3A1788247483%2C%22isBlip%22%3Afalse%2C%22authorBlogId%22%3A1635347849%2C%22authorId%22%3A%225876237249236402073%22%2C%22defaultBlogId%22%3A4%2C%22contentType%22%3A%22Media%22%2C%22sharedPostId%22%3Anull%2C%22parentId%22%3Anull%2C%22parentAuthorId%22%3Anull%2C%22starterId%22%3A1788247483%7D"><div class="meta--pe secondary-byline"></div><header><h1 class="headline entry-title js_entry-title">...
```
That's roughly ~10% of the markup (the first 548 of 5274 chars), and the article ID `1788247483` is already flung around 4 times. In addition to being repetitive, it's mostly of no value to the user. After putting the `<article>` through a diet of few `data-` attributes and reduced `class`es (which of course breaks most of the functionality) we're left with a fifth of the original (1003 chars). A savvy javascripter should be able to glue all the widgets and events back on based on the single `data-id` attribute on the `<article>` element; a sketch of that follows the slimmed-down markup below.
```html
<article data-id="1788247483"><header><h1><a href="/dont-trust-the-live-space-videos-you-see-on-facebook-1788247483">Don't Trust the 'Live' Space Videos You See on Facebook</a></h1><div class="author"><a href="//kinja.com/sophiekleeman">Sophie Kleeman</a></div><time datetime="2016-10-26T14:41:00-04:00"><a href="/dont-trust-the-live-space-videos-you-see-on-facebook-1788247483" title="10/26/16 2:41pm">Today 2:41pm</a></time><a class="replies" href="/dont-trust-the-live-space-videos-you-see-on-facebook-1788247483#replies">8</a></header><div><figure><a href="/dont-trust-the-live-space-videos-you-see-on-facebook-1788247483"><img src="//i.kinja-img.com/gawker-media/image/upload/s--1L9S9ad0--/c_fill,fl_progressive,g_center,h_180,q_80,w_320/euupntksbkomknodtsqv.png"/></a></figure><p>The Earth isn’t flat. The moon landing wasn’t bogus. And that “live” video of the International Space Station you might have seen floating around this morning on Facebook certainly wasn’t real.</p></div></article>
```
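A minimal sketch of that gluing, using one delegated click listener; the `.share` hook and the logged actions are hypothetical stand-ins, not Gizmodo's actual code:

```html
<script>
// one delegated listener instead of repeating the ID throughout the markup
document.addEventListener("click", function (e) {
  var article = e.target.closest("article[data-id]")
  if (!article) return
  var id = article.getAttribute("data-id")
  // hypothetical widget hooks, all keyed off the single data-id
  if (e.target.matches(".replies")) console.log("load comments for", id)
  if (e.target.matches(".share")) console.log("open share widget for", id)
})
</script>
```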
Fix 2 : CSS on a diet
Chrome's developer tools' Audits panel is handy for rooting out unused CSS. A short audit tour resulted in:
Site | Remove unused CSS rules verdict |
---|---|
Medium.com - 8 Things Every Person Should Do Before 8 A.M. | 3031 rules (84%) of CSS not used by the current page. |
TechCrunch | 2438 rules (77%) of CSS not used by the current page. |
Gizmodo | 3838 rules (82%) of CSS not used by the current page. |
Yahoo! | 3997 rules (80%) of CSS not used by the current page. |
Hacker News | 47 rules (47%) of CSS not used by the current page. |
(╯°□°)╯︵ ┻━┻ Either the audit tool must be broken or the results are just terrible. Yuge amounts of CPU cycles are wasted on parsing the rules and trying them on, except on Hacker News. A few ways to approach the problem:
- the framework should build the CSS files based on the application's actual usage of the available rules, or
- remove unused rules afterwards with other tools (a rough measuring sketch follows this list), or
- split the rules between common ones and those applied only on certain types of pages (eg. Gizmodo offers the same CSS on every page, while NYTimes has separate CSS for articles and the front page), or
- just write your own CSS and exhaust yourself before the 1000th rule :)
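As a rough approximation of what the audit measures, this console sketch counts the style rules whose selectors match nothing on the current page (same-origin stylesheets only; @media blocks and unqueryable selectors are skipped, so it's cruder than the real audit):

```js
// count style rules whose selectors match nothing on the current page
var total = 0, unused = 0
for (var i = 0; i < document.styleSheets.length; i++) {
  var rules
  try { rules = document.styleSheets[i].cssRules } catch (e) { continue } // cross-origin sheet
  if (!rules) continue
  for (var j = 0; j < rules.length; j++) {
    if (!rules[j].selectorText) continue // @media, @font-face etc.
    total++
    try {
      if (!document.querySelector(rules[j].selectorText)) unused++
    } catch (e) {} // selector the browser can't query
  }
}
console.log(unused + " of " + total + " rules match nothing")
```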
Fix 3 : Javascript on a diet by example
The omnipresent Google `analytics.js` is 27,805 characters at the time of writing. The common low-traffic site's copy-paste installation snippet before the closing `</head>` really doesn't need 90% of what's included for the single onload `pageView` event. The same sort of result could be achieved in ~500 characters (no cross-browser nor cookie stuff included):
```html
<script>
document.addEventListener("DOMContentLoaded", function () {
  // a single fire-and-forget pageview ping carrying the basics
  var xhr = new XMLHttpRequest()
  xhr.open("PUT", "https://ospi.netcode.fi/tracking-dongle.php")
  xhr.send(JSON.stringify({ "site-key" : "UA-88888888", "random" : Math.round(Math.random() * 1000000), "a": navigator.appCodeName, "b": navigator.appVersion, "c": navigator.userAgent, "d": navigator.platform, "e": navigator.language, "f": window.innerHeight, "g": window.innerWidth, "h": location.hostname, "i": location.pathname, "j": document.referrer }))
})
</script>
```
I'm not proficient with JavaScript, but I hope you get the idea :-] If one size fits all, it sure looks saggy on most. And the same goes for most of the glue-on JS frameworks, eg. jQuery. Similar tools should be applied to shedding the fat off JS as in the CSS case above.
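For completeness, the receiving end can be just as small. The original posts to a PHP endpoint, so the following Node collector is only an assumed stand-in that logs one line per hit (a real one would also have to answer the CORS preflight for the cross-origin PUT):

```js
// minimal stand-in for tracking-dongle.php: accept the PUT and log one line
var http = require("http")
http.createServer(function (req, res) {
  var body = ""
  req.on("data", function (chunk) { body += chunk })
  req.on("end", function () {
    try {
      var hit = JSON.parse(body)
      console.log(new Date().toISOString(), hit["site-key"], hit.h + hit.i, "ref: " + hit.j)
    } catch (e) {} // ignore malformed payloads
    res.writeHead(204) // the client doesn't need a response body
    res.end()
  })
}).listen(8080)
```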
Conclusion
As a former regular reader of many tech sites like Gizmodo, ArsTechnica, TechCrunch, Tom's Hardware etc. I nowadays just head elsewhere, as they lost their way at some point. Snappy lightweight sites, even if they're not "mobile-friendly", are usually more usable on a mobile device than their big brothers with @media queries and hamburger menus. And making clients burst like fireworks across the internet as the 3rd-party stuff loads is also very troublesome for privacy.
The web is awesome and accessible from (almost) anywhere, but the blatant neglect of site performance bogs it down. Probably everyone has a theory, and some even a personal experience, of how a site gathers bloat over the years: how log analysis and per-site banner deals transformed into 50 ad-tech / tracking widgets. Please do share a story if you've got one.