Why we made quickcache

One of the great things about Python is how easy it is to hit the ground running. The standard library is vast, and for every common problem people have, someone has written and published a library that you can download and install with pip. Often there is one right way to do things. Want to make HTTP requests? pip install requests. Want a database adapter, ORM, and migration system (but not using Django or some other integrated framework)? Use SQLAlchemy and alembic. And so on…
But when it comes to even the simplest kind of caching, I see code like this everywhere:
def get_all_posts(person):
    cache_key = 'get_all_posts:{}'.format(person.id)
    posts = cache.get(cache_key, default=None)
    if posts is None:
        # do the expensive operation
        posts = Post.objects.filter(person=person).all()
        cache.set(cache_key, posts)
    return posts
What’s going on here? We’ve got five lines devoted to the grunt work of caching and one line devoted to the actual thing we want to do. How could we improve this?
If you really inspect those lines, there are three things you need to know:
1. The function I want to cache is:
def get_all_posts(person):
    return Post.objects.filter(person=person).all()
2. The cache backend I want to use to do it is: cache
3. The value of the function only changes when person.id changes
Enter quickcache
Two years ago, I wrote quickcache so that I could cache functions the way I wanted to. With quickcache, the above looks like this:
from quickcache import get_quickcache

quickcache = get_quickcache(cache=cache)

@quickcache(['person.id'])
def get_all_posts(person):
    return Post.objects.filter(person=person).all()
The first parameter to quickcache is a list of names of arguments to vary on, and you can use intuitive dot notation (as in person.id) to access those arguments’ properties as well.
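To make the dot notation concrete, here is a minimal sketch of how a decorator could turn a name like 'person.id' into a value for the cache key. The helper resolve_vary_on is hypothetical and written only for illustration; it is not quickcache's actual implementation, and the key format is simply borrowed from the hand-written example above.

```python
import inspect
from types import SimpleNamespace


def resolve_vary_on(func, vary_on, args, kwargs):
    """Return the values named in vary_on for one call of func.

    Hypothetical helper for illustration; not part of quickcache.
    """
    # Map the call's positional and keyword arguments onto parameter names.
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()
    values = []
    for dotted_name in vary_on:
        # 'person.id' -> start at the 'person' argument, then follow '.id'.
        arg_name, *attrs = dotted_name.split('.')
        value = bound.arguments[arg_name]
        for attr in attrs:
            value = getattr(value, attr)
        values.append(value)
    return values


def get_all_posts(person, force=False):
    return []  # stand-in for the expensive query


person = SimpleNamespace(id=42, name='alice')
key_parts = resolve_vary_on(get_all_posts, ['person.id'], (person,), {})
cache_key = 'get_all_posts:' + ':'.join(str(part) for part in key_parts)
```

Because the names are resolved against the function's signature, the same key comes out whether person is passed positionally or as a keyword argument.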
Now, I should point out, I wrote this with Django’s cache library in mind, but you can use any backend as long as it is wrapped to present a very simple interface:
# get the value for key from the cache, return default if it's missing
cache.get(key, default=None)
# set the value for key to value
cache.set(key, value)
# remove key from the cache
cache.delete(key)
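As a rough illustration of how little that interface asks for, here is a dict-backed backend that satisfies it. The class name DictCache is an assumption made for this sketch, and it deliberately has no expiry or size limit, so treat it as a toy rather than something to run in production.

```python
class DictCache:
    """Toy in-process cache exposing get, set, and delete (illustrative only)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        # Return the stored value, or default if the key is missing.
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        # Deleting a missing key is a no-op rather than an error.
        self._data.pop(key, None)


cache = DictCache()
cache.set('greeting', 'hello')
hit = cache.get('greeting')
cache.delete('greeting')
miss = cache.get('greeting', default='absent')
```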
Tiered Caching
There’s often a tradeoff between caching in process memory and caching in a shared cache like memcached or redis. If you store it in process memory, then it is blazing fast to retrieve again in the same process, but on a multi-worker web service, for example, other forked processes would not get the same benefit. If you store it in a shared cache, then other processes can benefit from it, but if you access it multiple times in short succession, for example within the same web request, then each time requires a round trip to memcached/redis, which adds up in a way that caching in local memory doesn’t.
Why not use both?
At Dimagi, that is what we do, and quickcache supports a special configuration that makes this easy with Django out of the box, and some simple tools you can use to replicate it with your own cache backend as well.
quickcache comes with the concept of a TieredCache: simply a cache that combines two or more caches and outsources the gets, sets, and deletes to them. On a get it’ll try the first, then the second. On a set it’ll set in both. And on a delete it’ll delete from both. If you’re not using Django, you can use this helper to configure it using your own cache backends.
If you are using Django, it’s even easier. For a cache that stores in the shared cache for five minutes and in local memory for 10 seconds, you could use the following:
from quickcache.django_quickcache import get_django_quickcache
quickcache = get_django_quickcache(memoize_timeout=10, timeout=5 * 60)
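The get/set/delete behavior of a tiered cache described above can be sketched in a few lines. SimpleTieredCache and DictCache below are hypothetical names for this illustration, not quickcache's own classes, and the sketch ignores timeouts entirely.

```python
class DictCache:
    """Toy backend with the get/set/delete interface (illustrative only)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class SimpleTieredCache:
    """Sketch of a cache that delegates to several caches in order."""

    def __init__(self, caches):
        self.caches = list(caches)

    def get(self, key, default=None):
        # A private sentinel distinguishes "missing" from a cached None.
        missing = object()
        for cache in self.caches:
            value = cache.get(key, default=missing)
            if value is not missing:
                return value
        return default

    def set(self, key, value):
        for cache in self.caches:
            cache.set(key, value)

    def delete(self, key):
        for cache in self.caches:
            cache.delete(key)


local, shared = DictCache(), DictCache()
tiered = SimpleTieredCache([local, shared])

shared.set('k', 'from shared')   # present only in the second tier
found = tiered.get('k')          # first tier misses, second tier hits

tiered.set('k2', 1)              # written to both tiers
in_both = (local.get('k2'), shared.get('k2'))
tiered.delete('k2')              # removed from both tiers
after_delete = (local.get('k2'), shared.get('k2'))
```

This sketch does not copy a value back into the first tier after finding it in the second; whether to do that is a design choice the description above leaves open.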
Sometimes you want to skip the cache
At Dimagi, we also find ourselves writing code like this:
def get_all_posts(person, force=False):
    cache_key = 'get_all_posts:{}'.format(person.id)
    if force:
        posts = None
    else:
        posts = cache.get(cache_key, default=None)
    if posts is None:
        # do the expensive operation
        posts = Post.objects.filter(person=person).all()
        cache.set(cache_key, posts)
    return posts
That way, get_all_posts is cached by default, but you can also force it to skip the cache, in which case it recomputes the value and then updates the cache.
This also comes out of the box with quickcache, whether you use the generic or Django variants. To get exactly the same behavior, you can use:
@quickcache(['person.id'], skip_arg='force')
def get_all_posts(person, force=False):
    return Post.objects.filter(person=person).all()
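To show what the skip semantics amount to, here is a self-contained toy decorator that reproduces the hand-written force logic above. Everything in it (cached_by_first_arg, DictCache, keying on the first argument only) is an assumption made for this sketch and says nothing about how quickcache is implemented internally.

```python
import functools
import inspect


class DictCache:
    """Toy backend with get and set (illustrative only)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


def cached_by_first_arg(cache, skip_arg=None):
    """Toy decorator: cache on the first argument, optionally skippable."""
    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            first_value = next(iter(bound.arguments.values()))
            key = '{}:{}'.format(func.__name__, first_value)
            # Skip the cache read when the named argument is truthy.
            skip = bool(skip_arg) and bool(bound.arguments[skip_arg])
            missing = object()
            value = missing if skip else cache.get(key, missing)
            if value is missing:
                value = func(*args, **kwargs)
                # A skipped call still refreshes the cached value.
                cache.set(key, value)
            return value
        return wrapper
    return decorator


cache = DictCache()
calls = []


@cached_by_first_arg(cache, skip_arg='force')
def square(n, force=False):
    calls.append(n)  # record each real computation
    return n * n


square(3)               # computed and cached
square(3)               # served from the cache
square(3, force=True)   # recomputed, cache refreshed
square(3)               # served from the cache again
```

After these four calls the function body has run twice: once for the first call and once for the forced call.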
Easily configurable at every level
Of course, each time you use quickcache, you may want to use it with different defaults.
Each argument to get_quickcache or get_django_quickcache can be passed in at one of three stages, with later values overriding earlier ones. Usually, you’ll define a singleton quickcache that you use everywhere. For example,
# singletons/quickcache.py
quickcache = get_django_quickcache(memoize_timeout=10, timeout=5 * 60)
Then you don’t have to define quickcache every time. Just import it from this file. When you do use it, you can override the defaults. For example, here I override timeout and set skip_arg for the first time:
@quickcache([], timeout=60 * 60, skip_arg='force')
def ...
You can also bake in extra args at any time between when you call get{_django}_quickcache and when you use it. For example, if you’re going to be using skip_arg='force' repeatedly in a file, you can inherit the defaults from your main quickcache object, and change just skip_arg:
from singletons.quickcache import quickcache
skippable_quickcache = quickcache.but_with(skip_arg='force')
Then the previous example could just be
@skippable_quickcache([], timeout=60 * 60)
def ...
For either flavor, the arguments you can set are vary_on, skip_arg, and two advanced args that let you mess with the internals: helper_class (if you’re really dying to override some quickcache internals) and assert_function (if you want fine-grained control over the way certain warnings are logged).
For get_quickcache only, you can also set the cache argument. For get_django_quickcache only, you can set the memoize_timeout and timeout arguments.
Custom vary_on and skip_arg
Using string values for skip_arg and a list of strings for vary_on works nine out of 10 times, but when you really want control, you can pass a function to either. The function should have the same arguments as the function you’re decorating. Your vary_on function should return the value to vary on; your skip_arg function should return True when you want to skip the cache.
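As a sketch of what such functions might look like for get_all_posts, the two below share its signature. The names vary_on_person and should_skip, the is_staff attribute, and the choice to return a list from the vary_on function are all assumptions made for illustration; check the quickcache source for the exact return shape it expects.

```python
from types import SimpleNamespace


def vary_on_person(person, force=False):
    # Same arguments as the decorated function; returns what to vary on.
    # Returning a list is an assumption of this sketch.
    return [person.id]


def should_skip(person, force=False):
    # Skip the cache when forced, or (hypothetically) for staff users.
    return bool(force or getattr(person, 'is_staff', False))


# Passed to the decorator in place of the string forms, roughly:
#
# @quickcache(vary_on_person, skip_arg=should_skip)
# def get_all_posts(person, force=False):
#     return Post.objects.filter(person=person).all()

person = SimpleNamespace(id=7, is_staff=False)
vary_values = vary_on_person(person)
skip_default = should_skip(person)
skip_forced = should_skip(person, force=True)
```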
Clearing the cache
A caching utility becomes nearly useless if it doesn’t let you easily clear a particular cache value. quickcache has a nice interface for doing this as well.
# person writes a new post
Post(person=person).save()
# now the cached value for get_all_posts(person) is out of date
# so clear it:
get_all_posts.clear(person)
# now it'll return a fresh value next time you call it
all_posts = get_all_posts(person)
Conclusion
For us at Dimagi, quickcache has been a dream come true. We use it all the time. It is hard to imagine what building web applications in Python would be like without it; I’ve largely blocked those memories out. It’s so nice to use, simple, and powerful that it feels wrong to keep it all to ourselves.
That’s why we’ve decided to spend a little extra time to isolate the Django dependencies, eliminate any dependency on the CommCare HQ codebase, and publish it on PyPI.
Sometimes there’s a right way to do it in Python. Want to cache values at the function level? pip install quickcache.
Check out the code at https://github.com/dimagi/quickcache.