Thursday, 11 October 2007

Reflection: hasattr

This is the first of, I hope, many articles about reflection in Python. In general, we speak of reflection when a program has the ability to "observe" and possibly modify its own structure and behaviour (from Wikipedia). Scripting languages often offer many ways in which program code can be observed and modified. Reflection gives the programmer a very powerful tool, but it doesn't come without a price. Used carelessly, it can lead to code that is unreadable and unmanageable, and it quickly turns out that adding a feature or fixing a bug in such code requires adding more and more "bad" reflection. In my articles, I'll try to summarise the positive and negative impacts of various uses of reflection in Python.

Let's start with the most basic reflection function in Python, one that almost everyone uses: the hasattr() built-in. The most common usage of this function that I encounter is something like:

if hasattr(obj, "hello_world"):
    obj.hello_world()
else:
    pass  # do nothing or throw an exception

From a readability point of view, this is a perfectly valid piece of code. However, if you look at Python's source code, you'll notice that hasattr() is implemented in a peculiar way (code from Python-2.5.1\Python\bltinmodule.c):

static PyObject * builtin_hasattr(PyObject *self, PyObject *args) {
    // [CUT]
    v = PyObject_GetAttr(v, name);
    if (v == NULL) {
        // [CUT]
        return Py_False;
    }
    // [CUT]
    return Py_True;
}

Can you see it? hasattr() simply calls getattr() and checks whether the call threw an exception! So the Python code above could be written without using hasattr() at all:

try:
    hello_world = obj.hello_world
except AttributeError:
    pass  # do nothing or throw an exception
else:
    hello_world()

The main benefit is that we no longer need the string literal (the second parameter of hasattr()), which we always forget to change when we rename our methods.

The local variable looks a bit ugly, but you can get rid of it at the cost of running a second getattr() -- a small price for readability:

try:
    obj.hello_world
except AttributeError:
    pass  # do nothing or throw an exception
else:
    obj.hello_world()


Some may ask: Why would I use a complicated try..except..else clause when I could simply write:

try:
    obj.hello_world()
except AttributeError:
    pass  # do nothing or throw an exception

Well, you could, if you are sure that hello_world() itself will never throw AttributeError. If it does, you will not notice: your code will conclude that obj doesn't have the hello_world attribute. Still, this is a good solution in many cases.
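
To make the risk concrete, here is a minimal sketch of a method that contains a bug raising AttributeError internally. Greeter, call_compact and call_separated are hypothetical names invented for this illustration.

```python
class Greeter:
    """Hypothetical class whose hello_world() contains a bug."""

    def hello_world(self):
        # Bug: 'name' was never set, so this raises AttributeError internally.
        return "Hello, " + self.name


def call_compact(obj):
    """Lookup and call share one try block: the inner error is swallowed."""
    try:
        return obj.hello_world()
    except AttributeError:
        return "no hello_world"


def call_separated(obj):
    """Only the lookup is guarded: the inner error propagates."""
    try:
        hello_world = obj.hello_world
    except AttributeError:
        return "no hello_world"
    else:
        return hello_world()


compact_result = call_compact(Greeter())

try:
    call_separated(Greeter())
    bug_surfaced = False
except AttributeError:
    bug_surfaced = True
```

The compact version reports "no hello_world" even though the method exists, while the try..except..else version lets the real error reach the caller.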

The thing I really dislike about hasattr() is that you can pass anything as the second argument. Even if you use a constant, things start to look ugly:

if hasattr(obj, HELLO_WORLD):
    obj.hello_world()

Here, if you wanted to rename the hello_world method, it would be even harder to remember to change the constant as well.

If you have to use constants, a slightly better idea would be:

if hasattr(obj, HELLO_WORLD):
    getattr(obj, HELLO_WORLD)()

But then again, a more elegant solution is:

try:
    hello_world = getattr(obj, HELLO_WORLD)
except AttributeError:
    pass
else:
    hello_world()

or, if you are sure that an inner AttributeError won't be thrown:

try:
    getattr(obj, HELLO_WORLD)()
except AttributeError:
    pass
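
A variant not discussed above, offered only as a sketch: getattr() also accepts a third argument that is returned when the attribute is missing, which keeps the lookup and the call apart without a try block. Polite and call_if_present are hypothetical names, and the sketch assumes that None is never a legitimate value for the attribute.

```python
HELLO_WORLD = "hello_world"  # the constant from the examples above


class Polite:
    """Hypothetical class that does provide the method."""

    def hello_world(self):
        return "hello"


def call_if_present(obj, name):
    """Call obj.<name>() if it exists; return None otherwise."""
    method = getattr(obj, name, None)  # the default suppresses AttributeError
    if method is None:
        return None
    return method()  # an AttributeError raised in here is not swallowed


present = call_if_present(Polite(), HELLO_WORLD)
absent = call_if_present(object(), HELLO_WORLD)
```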


But the most horrible examples I've seen look like this:

method_name = calculate_method_name_or_take_it_from_some_place()
if hasattr(obj, method_name):
    getattr(obj, method_name)()

Apart from a very few special cases where this kind of behaviour is acceptable, you should avoid it at all costs. Because the method name never appears in the calling code, searching the source for it finds nothing. Too many times I have found myself reverting repository commits after removing a method I was sure no one used.
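
One possible way out, sketched under the assumption that the set of methods callable by name is known up front, is an explicit table that maps names to methods. Report, ACTIONS and run_action are hypothetical names used only for this illustration.

```python
class Report:
    """Hypothetical object with two actions that may be chosen at run time."""

    def export_csv(self):
        return "csv"

    def export_html(self):
        return "html"


# Every method reachable by name is listed here, so a search for
# "export_csv" finds this table before anyone deletes the method.
ACTIONS = {
    "csv": Report.export_csv,
    "html": Report.export_html,
}


def run_action(report, action_name):
    """Look the action up in the explicit table instead of using hasattr()."""
    try:
        action = ACTIONS[action_name]
    except KeyError:
        raise ValueError("unknown action: %r" % (action_name,))
    return action(report)


csv_result = run_action(Report(), "csv")
```

With this layout, removing Report.export_csv fails as soon as the module is loaded, because building the table raises AttributeError, rather than silently turning the call into a no-op.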

To summarise, I think that hasattr(), when passed a string written literally in the code, is an example of "good" reflection, although it can be replaced with try..except clauses. Passing it anything other than a literal string or an easily accessible constant should usually be considered improper, and be avoided or refactored.
