Saturday 1 June 2024

Understanding descriptors in Python

Background

As you know, Python does not enforce private variables and has no dedicated getter/setter syntax. In one of the previous posts, we saw how property can be used to achieve something similar. In this post, we will examine the concept underlying property, called descriptors. Descriptors are the Pythonic way to manage attribute access on a class, and they are the mechanism behind properties, methods, static methods, class methods, and super().
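As a quick illustration of that last claim, the sketch below checks that a property, a plain function, and a static method all implement __get__. The Demo class and its member names are made up for this example only.

```python
class Demo:
    @property
    def value(self):
        return 42

    def method(self):
        return "called"

    @staticmethod
    def helper():
        return "static"


# Fetch the raw objects from the class __dict__ so that normal attribute
# lookup (and therefore __get__) is not triggered.
raw_property = Demo.__dict__["value"]
raw_method = Demo.__dict__["method"]
raw_static = Demo.__dict__["helper"]

# Each of them defines __get__ on its type, which makes it a descriptor.
has_get = [hasattr(type(obj), "__get__") for obj in (raw_property, raw_method, raw_static)]
print(has_get)

# Calling __get__ by hand on a plain function produces a bound method.
bound = raw_method.__get__(Demo(), Demo)
print(bound())
```

This is why calling a method on an instance passes self automatically: the function's __get__ binds the instance before the call happens.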


Understanding descriptors

Before we look at descriptors themselves, let's take an example to see why we need them. Consider the simple Employee class below:

class Employee:
    def __init__(self, name):
        self.name = name


emp = Employee("Aniket")
print(emp.name)


For simplicity, it has just one instance variable called "name". In real life you would want some validation when the name of an Employee is set. For example, you might want to ensure that it has at least one character and at most 10 characters. We saw how to do this with properties in the last post. In this post we will see how to do it with descriptors, which are the basis of property as well.

class Name:
    def __set__(self, instance, value):
        print("Invoking __set__ on Name")
        if not isinstance(instance, Employee):
            raise ValueError("Name descriptor is to be used with Employee instance only")
        if len(value) < 1 or len(value) > 10:
            raise ValueError("Name cannot be less than 1 char or more than 10 char")
        instance._name = value

    def __get__(self, instance, owner):
        print("Invoking __get__ on Name")
        if not isinstance(instance, Employee):
            raise ValueError("Name descriptor is to be used with Employee instance only")
        return instance._name


class Employee:
    name = Name()

    def __init__(self, name):
        self._name = None
        self.name = name


# Employee("") would raise ValueError here because the name is empty,
# so it is left out to let the rest of the script run.


emp = Employee("Aniket")
print(emp.name)
emp.name = "Abhijit"
print(emp.name)


The above code defines a descriptor called "Name" and uses it to manage the name attribute of the Employee class. It prints:

Invoking __set__ on Name
Invoking __get__ on Name
Aniket
Invoking __set__ on Name
Invoking __get__ on Name
Abhijit

You can play around with passing different names to the constructor:
  • "Aniket" - Works fine and prints Aniket
  • "" - Fails with ValueError: Name cannot be less than 1 char or more than 10 char
  • "Aniket Thakur" - Fails with ValueError: Name cannot be less than 1 char or more than 10 char

See how we now have more granular control over the Name attribute of the Employee class. That's the power of descriptors. 
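The three cases listed above can be checked in one go. This is a minimal sketch: the descriptor is a quieter copy of Name without the print calls and the isinstance check, and try_name is a hypothetical helper added only for this demonstration.

```python
class Name:
    """Data descriptor that validates the length of a name (1 to 10 chars)."""

    def __set__(self, instance, value):
        if len(value) < 1 or len(value) > 10:
            raise ValueError("Name cannot be less than 1 char or more than 10 char")
        instance._name = value

    def __get__(self, instance, owner):
        return instance._name


class Employee:
    name = Name()

    def __init__(self, name):
        self.name = name  # goes through Name.__set__


def try_name(candidate):
    """Hypothetical helper: return the stored name, or the error text."""
    try:
        return Employee(candidate).name
    except ValueError as exc:
        return f"ValueError: {exc}"


results = {c: try_name(c) for c in ["Aniket", "", "Aniket Thakur"]}
for candidate, outcome in results.items():
    print(repr(candidate), "->", outcome)
```

Only the first name passes validation. The other two are rejected inside __set__ before anything is stored on the instance.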

Notice how emp.name = "Abhijit" works. Normally it would have stored the string "Abhijit" under the key "name" in the instance dictionary of emp. Since name is a data descriptor defined on the Employee class, the assignment instead calls the __set__ dunder / magic method of the descriptor.
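To make that dispatch visible, the sketch below performs by hand roughly what the interpreter does for the assignment and the lookup. It uses a stripped-down Name without validation, and the variable names descriptor and via_protocol are illustrative only.

```python
class Name:
    def __set__(self, instance, value):
        instance._name = value

    def __get__(self, instance, owner):
        return instance._name


class Employee:
    name = Name()

    def __init__(self, name):
        self.name = name


emp = Employee("Aniket")

# Fetch the descriptor object itself from the class, bypassing __get__.
descriptor = type(emp).__dict__["name"]

# Roughly what `emp.name = "Abhijit"` does under the hood.
descriptor.__set__(emp, "Abhijit")

# Roughly what `emp.name` does under the hood.
via_protocol = descriptor.__get__(emp, type(emp))
print(via_protocol, emp.name)

# The value lives in _name; there is no "name" key in the instance dictionary.
print(vars(emp))
```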

Descriptor protocol

A class is a descriptor if it defines at least one of the following methods:
  • __get__(self, obj, type=None) -> value
  • __set__(self, obj, value) -> None
  • __delete__(self, obj) -> None

Define any of these methods and an object is considered a descriptor and can override default behavior upon being looked up as an attribute.

  • If an object defines __set__() or __delete__(), it is considered a data descriptor
  • Descriptors that only define __get__() are called non-data descriptors

NOTE: Data descriptors always override instance dictionaries.
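The two rules above can be turned into a small check. This is only a sketch: classify and the three sample classes are hypothetical names, and the test simply looks for the protocol methods on the object's type.

```python
class GetOnly:
    def __get__(self, instance, owner):
        return "get only"


class WithSet:
    def __get__(self, instance, owner):
        return "with set"

    def __set__(self, instance, value):
        pass


class WithDelete:
    def __get__(self, instance, owner):
        return "with delete"

    def __delete__(self, instance):
        pass


def classify(obj):
    """Hypothetical helper that names the kind of descriptor obj is."""
    tp = type(obj)
    if hasattr(tp, "__set__") or hasattr(tp, "__delete__"):
        return "data descriptor"
    if hasattr(tp, "__get__"):
        return "non-data descriptor"
    return "not a descriptor"


kinds = {type(obj).__name__: classify(obj) for obj in (GetOnly(), WithSet(), WithDelete(), 42)}
print(kinds)
```

The standard library offers inspect.isdatadescriptor for the first of these checks. The hand-written version is used here only to keep the rule explicit.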

The example we saw above was a data descriptor. Consider the following example:

class Name:

    def __get__(self, instance, owner):
        print("Invoking __get__ on Name")
        if not isinstance(instance, Employee):
            raise ValueError("Name descriptor is to be used with Employee instance only")
        return instance._name


class Employee:
    name = Name()

    def __init__(self, name):
        self._name = name


emp = Employee("Aniket")
print(emp.name)
emp.name = "Abhijit"
print(emp.name)


Here we do not have a __set__ method, so Name is not a data descriptor and an entry in the instance dictionary can override it. The above prints:
Invoking __get__ on Name
Aniket
Abhijit
Notice how __get__ was invoked exactly once. When we set emp.name to "Abhijit", the assignment stored a normal string under the key "name" in the instance dictionary, and that entry now shadows the non-data descriptor defined on the class.
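Inspecting the instance dictionary shows the shadowing directly. The sketch below uses a Name without the print call and the isinstance check; the variable names before, after, shadowed, and restored are illustrative only.

```python
class Name:
    def __get__(self, instance, owner):
        return instance._name


class Employee:
    name = Name()

    def __init__(self, name):
        self._name = name


emp = Employee("Aniket")
before = dict(vars(emp))

emp.name = "Abhijit"  # no __set__, so this writes to the instance dictionary
after = dict(vars(emp))
shadowed = emp.name   # the instance dictionary wins over the non-data descriptor

print(before)
print(after)
print(shadowed, emp._name)  # _name, which the descriptor reads, is untouched

del emp.name          # remove the shadowing entry from the instance dictionary
restored = emp.name   # the descriptor is consulted again
print(restored)
```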


One last thing: as noted above, a class is a data descriptor if it has __set__ or __delete__. So if it does not have __set__ but does have __delete__, it is still a data descriptor and cannot be overridden through the instance dictionary.

class Name:

    def __get__(self, instance, owner):
        print("Invoking __get__ on Name")
        if not isinstance(instance, Employee):
            raise ValueError("Name descriptor is to be used with Employee instance only")
        return instance._name

    def __delete__(self, instance):
        del instance._name


class Employee:
    name = Name()

    def __init__(self, name):
        self._name = name


emp = Employee("Aniket")
print(emp.name)
emp.name = "Abhijit"
print(emp.name)

The above code fails with:
Invoking __get__ on Name
Aniket
Traceback (most recent call last):
  File "/Users/aniketthakur/PycharmProjects/HelloWorld/descriptors.py", line 29, in <module>
    emp.name = "Abhijit"
AttributeError: __set__

That is because Name is a data descriptor, so the assignment must go through the descriptor, but it has no __set__ method to call.
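If the intent is a read-only attribute, one common approach is to define __set__ explicitly and raise AttributeError with a clearer message. The sketch below assumes that intent; ReadOnlyName and its message text are hypothetical and not part of the example above.

```python
class ReadOnlyName:
    """Hypothetical read-only variant of the Name descriptor."""

    def __get__(self, instance, owner):
        return instance._name

    def __set__(self, instance, value):
        # Defining __set__ keeps this a data descriptor and lets us
        # control the error message.
        raise AttributeError("name is read-only")


class Employee:
    name = ReadOnlyName()

    def __init__(self, name):
        self._name = name


emp = Employee("Aniket")
try:
    emp.name = "Abhijit"
    message = None
except AttributeError as exc:
    message = str(exc)

print(emp.name)
print(message)
```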
