It seems like every time I look at our build pipeline, our builds are taking longer and longer to go from commit to being out in the wild. A lot of that time is spent running our test suite and static code analysis tools, and those steps can't be reduced much further; it's not really in our best interest to tell our developers to stop writing new tests. So the savings have to come from somewhere else. Luckily, there's still a very large portion of time spent getting our code ready and onto all the production boxes. So let's have a bit of a dig and see if we can cut some time off our deployment process.
The most important part of benchmarking, and of determining whether you've made any impact at all, is having a good initial benchmark. So I trawled through our build history, analysing timestamps to find a typical deployment. Below is an extract of the relevant log lines with timestamps.
Straight away, a couple of the steps stick out. Compiling the assets and uploading them takes about 75% of the total time. It's pretty ridiculous that these steps run on every deploy, even when there are no asset changes between the currently live version and the version being deployed.
If you didn't know, Rails' asset precompile is supposed to be almost instant when assets are unchanged. So why is it taking so long for our builds regardless? It turns out that Sprockets leverages a cache directory holding partial asset artifacts, which can be almost instantly assembled into production assets. Whenever the original assets change, these artifacts are invalidated, and the build process has to start from scratch.
Due to the nature of some of our assets, and our slightly unreliable CI/CD cache, we don't carry our Sprockets cache between builds. In the future we should come back and re-evaluate this, potentially using S3 to carry zipped asset artifacts between builds of the same branch. For now, though, each build has to compile the assets from scratch, which leads to the extensive build times seen above.
So, given that the source assets aren't changing every deploy, what can we do in these deploys to prevent building
identical production assets? Surely we can just reuse the assets that were compiled last build? To answer that, we'll
need to dig into how Rails determines which minified asset to serve and where we store our assets.
When Rails completes the asset precompile, it appends a fingerprint to each compiled asset. For example, application.js might be compiled to application-7d25452ceb63594739af24cde73b6499.js, where the fingerprint section (7d..99) is a hash of the file's contents. This means that if you run the precompile twice without changing anything, the exact same file will be generated, with the exact same file name. Not only does that help with cache busting, it also means we can keep a historical set of unique assets in case we need to roll back a release.
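As a toy illustration of the idea, here's a content digest turned into a fingerprinted file name. Sprockets' actual digesting pipeline is more involved than this, and the file content below is made up, but the principle is the same: identical content always produces an identical name.

```shell
# Write an example asset, then derive a fingerprinted name from its
# content digest. Same content in, same name out, every time.
printf 'console.log("hello");\n' > application.js

digest=$(md5sum application.js | awk '{print $1}')
echo "application-$digest.js"
```

Because the digest depends only on the file's contents, rerunning this over unchanged content can never produce a different name.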
Alongside these files, Rails also generates a manifest file. When the application is running, this manifest tells Rails that application.js can be found at application-7d25452ceb63594739af24cde73b6499.js. Effectively, it's just a mapping between original file names and fingerprinted file names. As with the fingerprinted files themselves, running the asset precompile twice with the same assets will generate the exact same manifest.
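For a concrete picture, here's roughly what that mapping looks like. This is a hand-written, trimmed sketch with illustrative digests; a real Sprockets manifest lives under public/assets and carries extra bookkeeping (file sizes, modification times, and so on) beyond the mapping shown here.

```shell
# Hand-written sketch of a Sprockets manifest (trimmed; digests are
# illustrative, not from a real build).
cat <<'JSON' > manifest-example.json
{
  "assets": {
    "application.js": "application-7d25452ceb63594739af24cde73b6499.js",
    "application.css": "application-4dd2a22eb6d88ac863f3b58bbdbb3a0e.css"
  }
}
JSON

# At runtime, Rails resolves a logical name through this mapping:
grep '"application.js"' manifest-example.json
```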
In the breakdown above, there was a step titled 'Upload Assets'. This step is where we take the compiled, fingerprinted assets and upload them to an AWS S3 bucket. As the assets' file names include fingerprints, we know that if a file already exists on S3, its contents will be identical, so we don't need to re-upload it. This saves us some time during the deploy, but more importantly, if the assets haven't changed since the last deploy, then we know that our compiled assets are already on S3.
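A sketch of that conditional upload might look like the following. The bucket name, key prefix, and the use of `aws s3api head-object` as the existence check are assumptions for illustration, not our actual deploy script.

```shell
# Upload compiled assets to S3, skipping any key that already exists.
# Because file names are fingerprinted, an existing key is guaranteed
# to hold identical content, so skipping it is safe.
sync_assets() {
  bucket="$1"
  dir="$2"
  for f in "$dir"/*; do
    key="assets/$(basename "$f")"
    # head-object exits non-zero when the key is absent.
    if aws s3api head-object --bucket "$bucket" --key "$key" >/dev/null 2>&1; then
      echo "skip $key (already on S3)"
    else
      aws s3 cp "$f" "s3://$bucket/$key"
    fi
  done
}

# Example invocation (talks to AWS for real):
# sync_assets example-assets-bucket public/assets
```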
Tanda's web instances don't have a copy of the compiled assets on them, just the asset manifest containing the mapping between the original file names and fingerprinted file names. This, coupled with some Rails configuration, tells our Rails instances that application.js can actually be found at its fingerprinted location on S3.
Putting all of this together, we know that if the assets are unchanged and we can get a copy of the previous manifest,
we don't need to rerun the precompile or upload anything to S3. We can just drop the old manifest onto the boxes as part
of the deploy, and voilà, we have access to the correct, compiled, production assets. All currently running Rails
instances have the current manifest (or else they themselves wouldn't be able to access assets), so we can get a copy of
it from them.
The harder task is determining whether the assets have changed without running the precompile. The fingerprint added by Sprockets is a hash of the contents of the compiled file, so we can't compute this hash unless we run the intensive task of compiling the assets. But we need the hash to determine whether the assets are unchanged so we can decide to skip the precompile. Catch-22. Let's look at another approach.
We know that when the original assets change, so too do some (or all) of the compiled assets. So what if we 'fingerprint' the uncompiled assets and compare it to the 'fingerprint' of the currently deployed assets? That way, if the fingerprints match, we can skip the precompile and the upload, using the currently deployed manifest instead.
So how the heck do we generate this 'fingerprint'? A quick Google search revealed that creating a hash for a whole directory isn't that hard, and thankfully Rails' directory structure dictates that all raw assets live in app/assets. So the 'fingerprint' for our assets can be computed as the fingerprint for the whole app/assets directory.
First, I grabbed the full list of assets using find, then I piped each of them into md5sum, giving me a hash for
each file. Finally, I piped all those hashes back into md5sum giving me a hash of all the hashes. If a file changes,
its hash will change, which will also mean the hash of hashes will change. This final hash is exactly what we were
looking for, a 'fingerprint' for the assets directory.
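The pipeline described above can be sketched as a small shell function. One refinement over the description: adding a `sort` between the two md5sum passes keeps the result stable regardless of the order `find` happens to return files in.

```shell
# Compute a single 'fingerprint' for everything under a directory:
# hash every file, sort for a stable ordering, then hash the hashes.
fingerprint() {
  find "$1" -type f -print0 \
    | xargs -0 md5sum \
    | sort \
    | md5sum \
    | awk '{print $1}'
}

# Usage: fingerprint app/assets
```

A nice side effect: md5sum's output lines include the file paths, so renaming or moving a file also changes the final fingerprint, which is exactly what we want.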
I modified the build process to run the new 'fingerprinting' command on both the new app/assets directory being deployed and the currently deployed app/assets directory. If they matched, I downloaded the manifest file from a currently deployed box. A few empty commits and a bit of debugging later, I got this:
If you didn't already do some quick mental math on the timestamps above, the breakdown below should reveal it all. In short, the build process went from 8:43 to 2:46 when there were no new assets. That's a speedup of 3.15x! I don't know about you, but I'd call shaving 6 whole minutes off the build process a good day's work.
There are a lot more improvements that could be made to this process in the future. As an aside, at Tanda we use Webpack for all of our new assets, and Sprockets for all existing assets. This process could be improved by computing a separate 'fingerprint', like we did for the original assets as a whole, for Sprockets and Webpack individually. The Sprockets precompile takes about 3 minutes, whereas the Webpack precompile takes about 7 seconds. If we could correctly split the assets and compute these fingerprints, changing the Webpack assets wouldn't trigger the Sprockets precompile, which would be another significant saving.
All in all, the takeaway from this exercise is this: don't be afraid to dig a little deeper into production-critical processes. Definitely get your code heavily reviewed and tested before using it in production, but don't hold back thinking the process will fix or improve itself. If a part of your development cycle isn't living up to your expectations, and it hasn't been fixed already, then either no one else has noticed or they're too scared to touch it. So heed the wise words of Master Yoda and don't fear digging deeper.