Sunday, 17 July 2011

A SQLite multiprocessing proxy, part 3

In a previous article I presented a first implementation of a SQLite proxy that makes it possible to distribute the workload over multiple processes using Python's multiprocessing module. In this third part of the series we analyze the performance of this setup.

High workload example

In our sample implementation we can vary the workload inside the processes that interact with the SQLite database by varying the size of the table that we query. A table with many rows takes more time to scan for a certain random value than a table with just a few rows.
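To make the idea concrete, here is a minimal sketch of a workload that scales with table size. It is not the code from the sample implementation: the table name `sample`, the column `value` and both helper functions are made up for illustration. The table has no index, so every query has to scan all rows.

```python
import random
import sqlite3
import time


def make_table(rows):
    """Create an in-memory table holding `rows` random integers (no index)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema, chosen only for this illustration.
    conn.execute("CREATE TABLE sample (value INTEGER)")
    conn.executemany(
        "INSERT INTO sample (value) VALUES (?)",
        ((random.randint(0, rows),) for _ in range(rows)),
    )
    conn.commit()
    return conn


def time_queries(conn, rows, queries=100):
    """Return the seconds needed to run `queries` scans for random values."""
    start = time.perf_counter()
    for _ in range(queries):
        target = random.randint(0, rows)
        # Without an index SQLite has to look at every row.
        conn.execute(
            "SELECT COUNT(*) FROM sample WHERE value = ?", (target,)
        ).fetchone()
    return time.perf_counter() - start


if __name__ == "__main__":
    for rows in (1, 100_000):
        conn = make_table(rows)
        print(rows, "rows:", round(time_queries(conn, rows), 4), "seconds")
        conn.close()
```

The sketch runs in a single process and uses 100,000 rows instead of one million to keep it quick; it only shows how the row count controls the cost of each query.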

The first graph we present here is about high workload: the table that we query is initialized with one million records. The graph shows the time to complete 100 queries. The test was done on a machine with 6 processor cores, and we show the results for 2 (deep purple, back) and 6 (light purple, front) worker processes and a varying number of threads.

The results are more or less what we expect: more worker processes means that the time to complete all tasks is reduced. However, the number of threads is also significant. If the number of threads is less than the number of available worker processes, we do not reach the full potential. Basically, we need at least as many threads as there are worker processes to keep those processes busy. If we have more threads than worker processes there is no further gain; in fact, we see a minute increase in the time needed to complete all tasks. This might be due to the overhead of creating and managing threads in Python.
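The relation between threads and worker processes can be illustrated with a small simulation. This is a sketch under loud assumptions, not the proxy itself: a semaphore stands in for the pool of worker processes, `time.sleep` stands in for a slow query, and the function `run` and its parameters are invented for this example.

```python
import threading
import time


def run(workers, threads, tasks=12, query_time=0.05):
    """Time `tasks` simulated queries issued by `threads` threads to `workers` slots."""
    slots = threading.Semaphore(workers)  # one slot per simulated worker process
    todo = iter(range(tasks))
    lock = threading.Lock()

    def client():
        while True:
            with lock:  # hand out the remaining tasks one at a time
                if next(todo, None) is None:
                    return
            with slots:  # block until a simulated worker is free
                time.sleep(query_time)  # stand-in for a slow query

    pool = [threading.Thread(target=client) for _ in range(threads)]
    start = time.perf_counter()
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    for threads in (1, 2, 6, 12):
        print(threads, "threads:", round(run(workers=6, threads=threads), 2), "seconds")
```

With 6 simulated workers the elapsed time should drop as threads are added up to 6 and then level off, because a single thread can only keep one worker busy at a time. The simulation does not capture the small extra cost of managing many threads.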

Low workload example

If we initialize our table with just a single row the workload will be negligible. If we draw the same kind of graph as for the high workload, we see a completely different picture.

Now we see hardly any difference between 2 worker processes and 6, and increasing the number of threads also has no effect. The data is also rather noisy, i.e. it varies quite a bit in a non-uniform manner, especially for the case with 2 worker processes. The reason for this behavior is not entirely clear to me, although it is obvious that because of the very small workload the time to set up communication with the worker process is a significant factor here.
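One possible way to read both graphs is a toy cost model. It is purely an assumption and not a measurement: it supposes that every task pays a fixed communication overhead that is not parallelized, while the query work itself is spread over the workers that are kept busy. The function name and parameters are hypothetical.

```python
import math


def estimated_time(tasks, workers, threads, query_time, overhead):
    """Toy estimate: serial per-task overhead plus query work spread over busy workers."""
    busy = min(workers, threads)      # workers that actually receive work
    rounds = math.ceil(tasks / busy)  # batches of queries running side by side
    # Assumption: the overhead is paid once per task and does not overlap.
    return tasks * overhead + rounds * query_time


if __name__ == "__main__":
    for label, query_time in (("high workload", 1.0), ("low workload", 0.0)):
        for workers in (2, 6):
            estimate = estimated_time(100, workers, 6, query_time, overhead=0.001)
            print(label, workers, "workers:", round(estimate, 3))
```

In this model the number of workers matters only while `query_time` is large compared to `overhead`; with a negligible query time the total is dominated by the overhead term and no longer depends on workers or threads. It says nothing about the noise in the measurements.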
