Every so often the need arises to create a thread-safe cache. This is my stab at a simple yet fully functional implementation that maintains the essential dictionary semantics, is thread safe, and has a fixed, configurable size, for example for use in a multithreaded HTTP server like CherryPy.
Although many dictionary operations, like getting an item, are reported to be atomic and therefore thread safe, this is actually an implementation-specific feature of the widely used CPython implementation. And even so, adding keys or iterating over the keys in the dictionary might not be thread safe at all. We must therefore use some sort of locking mechanism to ensure that no two threads try to modify the cache at the same time. (For more information check this discussion.)
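The problem shows up with compound operations: something like `d[k] += 1` is a read, an add, and a store, and another thread can interleave between those steps. A minimal sketch of the locking pattern (using a plain dict and a hypothetical hit counter, not the cache class itself):

```python
import threading

counts = {'hits': 0}
lock = threading.Lock()

def incr():
    for _ in range(100_000):
        # the read-add-store below is several bytecode steps; holding
        # the lock prevents another thread from interleaving with it
        with lock:
            counts['hits'] += 1

threads = [threading.Thread(target=incr) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counts['hits'])  # 400000: no updates lost while the lock is held
```

Without the `with lock:` line, updates may be silently lost when threads interleave, which is exactly the class of bug the cache below guards against.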
The `Cache` class shown here features a configurable size, and if the number of entries grows too big it removes the oldest entry. We do not have to maintain an explicit usage administration for that, because we make use of the properties of the `OrderedDict` class, which remembers the order in which keys are inserted and sports a `popitem()` method that will remove the first (or last) item inserted.
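To see the insertion-order behaviour we rely on, here is a quick look at `popitem()` on a plain `OrderedDict`:

```python
from collections import OrderedDict

d = OrderedDict()
d['a'] = 1
d['b'] = 2
d['c'] = 3

oldest = d.popitem(last=False)  # removes the first key inserted
newest = d.popitem()            # last=True (the default) removes the most recent
print(oldest, newest)           # ('a', 1) ('c', 3)
```

With `last=False`, `popitem()` gives us exactly the FIFO eviction a fixed-size cache needs.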
```python
from collections import OrderedDict
from threading import Lock

class Cache:
    def __init__(self, size=100):
        if int(size) < 1:
            raise AttributeError('size < 1 or not a number')
        self.size = size
        self.dict = OrderedDict()
        self.lock = Lock()

    def __getitem__(self, key):
        with self.lock:
            return self.dict[key]

    def __setitem__(self, key, value):
        with self.lock:
            while len(self.dict) >= self.size:
                self.dict.popitem(last=False)
            self.dict[key] = value

    def __delitem__(self, key):
        with self.lock:
            del self.dict[key]
```
Due to the functionality of the `OrderedDict` class we use, the implementation is very concise. The `__init__()` method merely checks whether the `size` attribute makes any sense and creates an instance of an `OrderedDict` and a `Lock`.
The `with` statements used in the remaining methods wait for the acquisition of the lock and guarantee that the lock is released even if an exception is raised. The `__getitem__()` method merely tries to retrieve a value by trying the key on the ordered dictionary after acquiring the lock. The `__setitem__()` method removes as many items as needed within its `while` loop to bring the size below the preset limit and then adds the new value. The `popitem()` method of an `OrderedDict` removes the least recently added key/value pair if its `last` argument is set to `False`.
The `__delitem__()` method also merely passes control on to the underlying dictionary. Together these methods allow any instance of our `Cache` class to be used like any other dictionary, as the example code below illustrates:
```python
>>> from cache import Cache
>>> c = Cache(size=3)
>>> c['key1'] = "one"
>>> c['key2'] = "two"
>>> c['key3'] = "three"
>>> c['key4'] = "four"
>>> c['key4']
'four'
>>> c['key1']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "cache.py", line 13, in __getitem__
    return self.dict[key]
KeyError: 'key1'
```
Of course this doesn't show off the thread safety, but it does show that the semantics are pretty much those of a regular dictionary. If needed, this class could even be extended with suitable iterators/views like `keys()` and `items()`, but for most caches this probably isn't necessary.