Sunday 20 February 2011

Checking an ISBN

International Standard Book Numbers consist of a string of digits followed by a checksum digit to help detect typos, not unlike schemes used in credit card numbers. Here I show a small Python function to calculate this check digit.

You can pass it a string of 9 or 10 digits. If you pass in 10 digits, the last one is ignored. The returned string consists of the first 9 digits plus a final check digit, which may be the character X (standing for the value 10).

The most common use case is to check whether a given ISBN is well formed by passing it to isbnchecksum() and comparing the result to the original:

if myisbn == isbnchecksum(myisbn):
    ... proceed ...

For the last 5 or 6 years the 10-digit ISBN has been replaced by the 13-digit ISBN, which starts with 978 or 979. This 13-digit sequence is compatible with the so-called EAN numbers used for marking all kinds of goods, not just books, in which the first 3 digits represent a country code. For books this country is the fictional Bookland. The checksum for these ISBN-13 codes is calculated differently and is not shown here, but the algorithm can be found on Wikipedia together with additional information on ISBNs. The small snippet shown below is part of a larger module that I wrote to retrieve information from sources like the Library of Congress and Amazon. That module can be found on my homepage.

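import string

def isbnchecksum(line):
    """
    Calculate the checksum for an isbn-10 number.
    """
    # Accept a full 10-character ISBN, but ignore its existing check digit.
    if len(line) == 10:
        line = line[0:9]
    if len(line) != 9:
        raise AttributeError('ISBN should be 9 digits, excluding checksum!')
    # Weighted sum: the first digit counts 10 times, the ninth digit twice.
    total = 0
    for count, ix in enumerate(line):
        total = total + (10 - count) * int(ix)
    # The check digit brings the weighted sum up to a multiple of 11.
    total = total % 11
    if total != 0:
        total = 11 - total
    # A check value of 10 is written as the character X.
    if total == 10:
        line = line + 'X'
    else:
        line = line + string.digits[total]
    return line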
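
To make the arithmetic concrete, here is a minimal Python 3 sketch that returns only the check character, followed by a worked example. The name isbn10_check_digit is made up for this illustration and is not part of the module mentioned above.

```python
def isbn10_check_digit(first9):
    """Return the ISBN-10 check character for a string of 9 digits.

    Hypothetical helper, for illustration only.
    """
    if len(first9) != 9 or not all(c in "0123456789" for c in first9):
        raise ValueError("expected exactly 9 digits")
    # Weights run from 10 (first digit) down to 2 (ninth digit).
    total = sum((10 - i) * int(d) for i, d in enumerate(first9))
    # The check value brings the weighted sum up to a multiple of 11.
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)

# Worked example for 030640615:
# 0*10 + 3*9 + 0*8 + 6*7 + 4*6 + 0*5 + 6*4 + 1*3 + 5*2 = 130
# 130 % 11 = 9 and 11 - 9 = 2, so the check digit is 2.
print(isbn10_check_digit("030640615"))  # prints 2
```

For 080442957 the weighted sum is 199, which leaves a remainder of 1, so the check value is 10 and the check character is X.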

After I published this I realized the solution wasn't all that Pythonic. A more elegant implementation (with slightly different semantics) might be the following bit of code, although some people argue that the ternary x if c else y operator is not Pythonic in any way. Instead of returning the ISBN with its check digit appended, it takes a complete 10-character ISBN and returns True if the check digit is correct. Note that we used the atoi() function from the locale module for even more portability.

from locale import atoi

def isbn10checksum(isbn):
    if len(isbn) != 10:
        raise AttributeError('isbn should be 10 digits')
    # A valid ISBN-10 has a weighted sum (weights 10 down to 1) that is
    # divisible by 11; a check digit of X stands for the value 10.
    return 0 == sum((10 - w) * atoi('10' if d == 'X' else d)
                    for w, d in enumerate(isbn.upper())) % 11
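
One thing to be aware of: the one-liner above accepts an X in any position, while in a real ISBN-10 it can only appear as the final character. The following is a rough sketch of a stricter variant. The name is_valid_isbn10 and the decision to strip hyphens and spaces first are my own assumptions for this example, not something the original module necessarily does.

```python
def is_valid_isbn10(isbn):
    """Return True if isbn looks like a well-formed ISBN-10.

    Hypothetical helper: hyphens and spaces are ignored (an assumption),
    and X is only accepted as the last character.
    """
    isbn = isbn.replace("-", "").replace(" ", "").upper()
    if len(isbn) != 10:
        return False
    body, last = isbn[:9], isbn[9]
    if not all(c in "0123456789" for c in body):
        return False
    if last not in "0123456789X":
        return False
    values = [int(c) for c in body] + [10 if last == "X" else int(last)]
    # Weights 10 down to 1; the weighted sum must be divisible by 11.
    return sum((10 - i) * v for i, v in enumerate(values)) % 11 == 0

print(is_valid_isbn10("0306406152"))     # prints True
print(is_valid_isbn10("0-8044-2957-X"))  # prints True
print(is_valid_isbn10("0306406153"))     # prints False
```

Unlike the version above, this sketch returns False for malformed input instead of raising an exception, which is a design choice rather than a requirement.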

Just for completeness' sake, here is the code for an ISBN-13 check. It is not very elegant, but it is a nice example of Python's extended slices (strides) in action:

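def isbn13checksum(isbn):
    # Uses atoi() from the locale module, imported above.
    if len(isbn) != 13:
        raise AttributeError('isbn should be 13 digits')
    # Digits at even indices have weight 1, digits at odd indices weight 3;
    # the check digit brings the weighted sum up to a multiple of 10.
    c = (10 - (sum(atoi(d) for d in isbn[0:12:2])
               + sum(3 * atoi(d) for d in isbn[1:12:2])) % 10) % 10
    return atoi(isbn[12]) == c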
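
As a worked example of the ISBN-13 arithmetic, here is a small sketch that computes the check digit for the first 12 digits. The name isbn13_check_digit is hypothetical and chosen only for this illustration.

```python
def isbn13_check_digit(first12):
    """Return the ISBN-13 check digit (as a string) for 12 digits.

    Hypothetical helper, for illustration only.
    """
    if len(first12) != 12 or not all(c in "0123456789" for c in first12):
        raise ValueError("expected exactly 12 digits")
    digits = [int(c) for c in first12]
    # Extended slices: [0::2] selects the weight-1 positions,
    # [1::2] selects the weight-3 positions.
    total = sum(digits[0::2]) + 3 * sum(digits[1::2])
    # The check digit brings the weighted sum up to a multiple of 10.
    return str((10 - total % 10) % 10)

# Worked example for 978030640615:
# (9+8+3+6+0+1) + 3*(7+0+0+4+6+5) = 27 + 66 = 93
# 93 % 10 = 3 and 10 - 3 = 7, so the check digit is 7.
print(isbn13_check_digit("978030640615"))  # prints 7
```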
