Every so often the need arises for a thread-safe cache. This is my stab at a simple yet fully functional implementation that maintains the essential dictionary semantics, is thread safe, and has a fixed, configurable size, for use, for example, in a multithreaded HTTP server like CherryPy.
Although many dictionary operations, such as getting an item, are reported to be atomic and therefore thread safe, this is actually an implementation-specific feature of the widely used CPython implementation. Even then, adding keys or iterating over the keys in a dictionary might not be thread safe at all. We must therefore use some sort of locking mechanism to ensure that no two threads try to modify the cache at the same time. (For more information check this discussion.)
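The locking pattern the class relies on can be illustrated in isolation. The sketch below (with made-up names, purely for illustration) has several threads mutate one shared dictionary, each entering the critical section through the same Lock used as a context manager:

```python
from threading import Lock, Thread

lock = Lock()
shared = {}

def put(n):
    # Each thread inserts 1000 distinct keys; the with statement
    # acquires the lock and guarantees it is released afterwards,
    # even if the body raises an exception.
    for i in range(1000):
        with lock:
            shared[(n, i)] = i

threads = [Thread(target=put, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(shared))  # 4000
```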
The Cache class shown here features a configurable size: if the number of entries grows too big, it removes the oldest entry. We do not have to maintain an explicit usage administration for that, because we make use of the properties of the OrderedDict class, which remembers the order in which keys are inserted and sports a popitem() method that removes the first (or last) item inserted.
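A quick illustration of that behaviour (not part of the original code):

```python
from collections import OrderedDict

d = OrderedDict()
d['a'] = 1
d['b'] = 2
d['c'] = 3

# last=False removes the first-inserted (oldest) item;
# last=True (the default) would remove the most recently inserted one.
oldest = d.popitem(last=False)
print(oldest)    # ('a', 1)
print(list(d))   # ['b', 'c']
```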
from collections import OrderedDict
from threading import Lock

class Cache:
    def __init__(self, size=100):
        if int(size) < 1:
            raise AttributeError('size < 1 or not a number')
        self.size = size
        self.dict = OrderedDict()
        self.lock = Lock()

    def __getitem__(self, key):
        with self.lock:
            return self.dict[key]

    def __setitem__(self, key, value):
        with self.lock:
            while len(self.dict) >= self.size:
                self.dict.popitem(last=False)
            self.dict[key] = value

    def __delitem__(self, key):
        with self.lock:
            del self.dict[key]
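One subtlety worth pointing out (an observation about the code above, not something covered in the original text): assigning to an existing key does not move it to the end of an OrderedDict, so updating a cached value does not refresh its age. Eviction is therefore strictly insertion-order FIFO, not LRU:

```python
from collections import OrderedDict

d = OrderedDict()
d['a'] = 1
d['b'] = 2
d['a'] = 99             # updating an existing key keeps its original position
print(list(d))           # ['a', 'b']
oldest = d.popitem(last=False)
print(oldest)            # ('a', 99): 'a' still counts as the oldest entry
```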
Due to the functionality of the OrderedDict class we use, the implementation is very concise. The __init__() method merely checks whether the size argument makes any sense and creates an OrderedDict instance and a Lock. The with statements used in the remaining methods wait for the acquisition of the lock and guarantee that the lock is released, even if an exception is raised. The __getitem__() method simply tries the key on the ordered dictionary after acquiring the lock. The __setitem__() method removes as many items as needed in its while loop to bring the size below the preset limit, and then adds the new value. The popitem() method of an OrderedDict removes the least recently added key/value pair when its last argument is set to False. __delitem__() likewise merely passes control on to the underlying dictionary. Together these methods allow any instance of our Cache class to be used like any other dictionary, as the example code below illustrates:
>>> from cache import Cache
>>> c = Cache(size=3)
>>> c['key1'] = "one"
>>> c['key2'] = "two"
>>> c['key3'] = "three"
>>> c['key4'] = "four"
>>> c['key4']
'four'
>>> c['key1']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "cache.py", line 13, in __getitem__
    return self.dict[key]
KeyError: 'key1'
Of course this doesn't show off the thread safety, but it does show that the semantics are pretty much those of a regular dictionary. If needed, this class could even be extended with suitable iterators/views like items(), but for most caches this probably isn't necessary.
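To actually exercise the thread safety, a quick stress test might look like the sketch below (the class is repeated inline, so the snippet runs standalone): four threads hammer one Cache instance with 1000 distinct keys each, and the size bound still holds afterwards.

```python
from collections import OrderedDict
from threading import Lock, Thread

# The Cache class from above, repeated so this sketch is self-contained.
class Cache:
    def __init__(self, size=100):
        if int(size) < 1:
            raise AttributeError('size < 1 or not a number')
        self.size = size
        self.dict = OrderedDict()
        self.lock = Lock()

    def __getitem__(self, key):
        with self.lock:
            return self.dict[key]

    def __setitem__(self, key, value):
        with self.lock:
            while len(self.dict) >= self.size:
                self.dict.popitem(last=False)
            self.dict[key] = value

    def __delitem__(self, key):
        with self.lock:
            del self.dict[key]

c = Cache(size=50)

def worker(n):
    # every thread inserts 1000 keys that no other thread uses
    for i in range(1000):
        c['key-%d-%d' % (n, i)] = i

threads = [Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(c.dict))  # 50: the size bound holds after 4000 concurrent inserts
```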