11 Introduction to Object Oriented Programming in Python

Object oriented programming, or OOP, is a way to develop large(r) software projects. It allows programmers to break code up into smaller components that can be reused by recombining them to create new objects.

You may not need to do OOP yourself, but you will almost certainly need to understand what it is and how it works so you can understand other code you might need to use.

The dict in Python is an example of a “small component” that can be used to create a new container that has extra features.

There are 4 main “principles” to OOP:

Inheritance: Re-using functionality
Polymorphism: Adding functionality
Abstraction: Hiding the details from the user
Encapsulation: Combining methods and data

This chapter will provide a brief introduction to each of these.

Note

Even if you don’t drive or have a driver’s license, you still need to know the “rules of the road” so that you can walk around safely. For instance, you need to know what stop signs mean and how traffic lights work. OOP is like that: you may never use OOP yourself to write code, but others do, so you need to know how it works.

11.1 Inheritance

We can create our own custom dict as follows:

class MyDict(dict):  # (1)
    pass  # (2)

d = MyDict()  # (3)
d.name = 'bob'  # (4)
print(d.name)

1: Three things happen on this line: (1) we are defining a new “class” which is analogous to defining a function using def (2) this new class will be called MyDict which has the same role as the function name and (3) this class inherits the behavior of Python’s dict, which is declared here the same way arguments are declared in a function definition.
2: The keyword pass just tells Python that we are doing nothing in this code block. It’s just a place holder (for now).
3: Here we create an instance of our MyDict class and assign it to d. This is called “instantiation”. d is called an “object”, and it is an “instance” of MyDict. Also note that d is technically also a dict since MyDict inherits all the functionality from dict.
4: We add a custom attribute to our object, in this case name. This is not allowed on instances of built-in types such as a plain dict, although this rule is a Python detail and not part of OOP principles.

bob
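
Notes 3 and 4 above can be checked directly. The following is a minimal sketch; the variable names is_dict and plain_allows_attributes are only for illustration.

```python
class MyDict(dict):
    pass


d = MyDict()
d['a'] = 1                     # normal dict behavior is inherited
is_dict = isinstance(d, dict)  # d is a MyDict, and therefore also a dict
d.name = 'bob'                 # allowed on our subclass

plain = dict()
try:
    plain.name = 'bob'         # a plain dict refuses new attributes
    plain_allows_attributes = True
except AttributeError:
    plain_allows_attributes = False

print(is_dict, plain_allows_attributes)
```

The plain dict raises an AttributeError, while the subclass accepts the new attribute even though its body is only pass.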

The process shown above is called “sub-classing” since we create a subclass of dict. Our new subclass has all the features of dict plus the additional features we add. This can be represented schematically as:

flowchart TD
    A(Parent Class)
    B(Sub Class)
    C(Object)
    A -->|We inherit the functionality of the Parent Class| B
    B -->|We initialize our class| C

To subclass means to take an existing “class” and create a new class based on it. This terminology was borrowed from the general concept of classification: mammals are a subclass of the more general class of living things, humans are a subclass of mammals, and objects are like individual humans.

Table 11.1: Summary of the OOP jargon, some new and some we’ve seen before

Jargon	Description
`class`	A generic template that defines the properties of an object.
`parent`	The class which the subclass is based on. The subclass will have all the functionality of the parent class plus any extra stuff we add.
`subclass`	A template based on a parent class, which has functionality added to it.
`child`	Also known as a subclass.
`inheritance`	A subclass or child class will inherit all the functionality of the parent class.
`object`	An instance of a class.
`instance`	A specific object created from a class, such as `d` above, which is an instance of `MyDict`.
`instantiation`	The act of creating an instance of a class.
`attribute`	Some data that is attached to an object, like `d.name`
`method`	A function that is attached to an object, like `d.keys()`

Now that we know how to create our own custom dict, let’s make one that has a “helper” method that outputs useful info for us.

Example 11.1 (Add a new method to a dict) Create a subclass of dict that has a function which outputs the maximum and minimum values stored under all the keys.

Solution

import numpy as np


class MyDict(dict):

    def info(self):  # (1)
        mx = -np.inf
        mn = np.inf
        for v in self.values():  # (2)
            mx = max(mx, np.amax(v))
            mn = min(mn, np.amin(v))
        return (mn, mx)

d = MyDict()
d['arr1'] = np.arange(-10, 10)
d['arr2'] = np.arange(10, 20)
print(d.info())

1: self is the conventional name for the first parameter of a method, and it means “this object”. In other words, if you do bob = MyDict() and then call bob.info(), then inside info() the name self will refer to bob. Python automatically passes the object as the first argument to every regular method, so we must always accept it, even if we don’t use it.
2: Here is a typical use of self and a classic example of inheritance: since self is a dict it has a values() method, which we can access with self.values(). This is also an example of Encapsulation, which is discussed further below, but essentially this method is operating on its own data.

(np.int64(-10), np.int64(19))

Comments In this case our inheritance structure looks like this:

flowchart TD
    A(Parent Class)
    B(Sub Class)
    C(Object)
    A -->|Inherit behavior of parent, then add some custom code| B
    B -->|Initialize our class| C

This is a contrived example, but it is actually pretty useful to add such a method to a custom dict. Note, however, that this only works if we somehow ensure that all the values in MyDict are numerical. We’ll see how to do this in Example 11.4.

The concept of self can be confusing at first, so it’s worth looking at it from a different angle. Let’s take the info() method we created in Example 11.1, and rewrite it as a normal function, which we are quite familiar with:

Example 11.2 (Convert a method to a normal function) Rewrite the info() method from Example 11.1 as a function which accepts a dictionary, but otherwise does exactly the same thing.

Solution

We can literally cut-and-paste the code from above, then modify it slightly:

def info(d):  # (1)
    mx = -np.inf
    mn = np.inf
    for v in d.values():  
        mx = max(mx, np.amax(v))
        mn = min(mn, np.amin(v))
    return (mn, mx)

1: The parameter is now called d instead of self

And to use this function, we need to do:

data = dict()
data['arr1'] = np.arange(-10, 10)
data['arr2'] = np.arange(10, 20)
mn, mx = info(data)
print(mn, mx)

-10 19

Comments

We don’t even have to change the name of the parameter from self! The following will work perfectly:

def info(self):
    mx = -np.inf
    mn = np.inf
    for v in self.values():  
        mx = max(mx, np.amax(v))
        mn = min(mn, np.amin(v))
    return (mn, mx)

But we should change it since self has a very specific meaning in Python, so other coders will be confused if/when they read our code. Keeping code readable and obeying the agreed-upon conventions is a major part of programming Pythonically.
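
To connect the two views of self, here is a minimal sketch showing that calling a method on an object is the same as calling the function stored on the class and passing the object in by hand. To keep it short, this variant of info() works on plain lists instead of NumPy arrays, which is an assumption made only for this illustration.

```python
class MyDict(dict):

    def info(self):
        # Flatten all stored lists, then take the overall min and max
        flat = [x for v in self.values() for x in v]
        return (min(flat), max(flat))


d = MyDict()
d['arr1'] = [-10, 0, 9]
d['arr2'] = [10, 19]

via_method = d.info()        # Python passes d as self for us
via_class = MyDict.info(d)   # here we pass d as self explicitly

print(via_method, via_class)
```

Both calls run the same code on the same object, which is all that self really means.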

11.2 Polymorphism

Polymorphism means “to have many forms”. In the case of OOP, this means that a method of a subclass can have the same name as a method of the parent class, but do something different. We can either “overwrite” the behavior of the parent class’s method(s), which is more commonly called “overriding”, or “overload” them, which in this chapter means adding extra behavior on top of what the parent already does.

11.2.1 Method Overwriting

Method overwriting is pretty straightforward to grasp, as shown in the following example.

Example 11.3 (Customize printing by overwriting the __str__() method) Create a subclass of dict that prints each key-value pair on a new line.

Solution

class MyDict3(dict):

    def __str__(self):
        s = []
        for k, v in self.items():
            s.append(str(k) + ': ' + str(v) + '\n')
        s = ''.join(s)
        return s


d = MyDict3()
d['item 1'] = 0
d['item 2'] = 'bob'
d['item 3'] = ['a', 'list', 'of', 'strings']
print(d)

item 1: 0
item 2: bob
item 3: ['a', 'list', 'of', 'strings']

Comments

As a reminder, here is how printing the dict normally looks:

d = dict()
d['item 1'] = 0
d['item 2'] = 'bob'
d['item 3'] = ['a', 'list', 'of', 'strings']
print(d)

{'item 1': 0, 'item 2': 'bob', 'item 3': ['a', 'list', 'of', 'strings']}

We have completely over-written the original behavior. Here is a flow diagram of the process:

flowchart TD
    A(Parent Class)
    B(Sub Class)
    C(Object)
    A -->|Inherit functionality from Parent, but overwrite __str__| B
    B --> C
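
Only the method we redefine changes; everything else is still inherited. As a small sketch of this, the class below overwrites __str__ exactly as in Example 11.3, and we then compare str(d) with repr(d), which still uses the version inherited from dict. The variable names custom and inherited are only for illustration.

```python
class MyDict3(dict):

    def __str__(self):
        s = []
        for k, v in self.items():
            s.append(str(k) + ': ' + str(v) + '\n')
        return ''.join(s)


d = MyDict3()
d['item 1'] = 0

custom = str(d)      # uses our overwritten __str__
inherited = repr(d)  # __repr__ was not overwritten, so dict's version runs

print(custom)
print(inherited)
```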

Dunder methods

The methods which have double underscores on each side, like __<method>__, are called “dunder” methods, from “Double UNDERscore”. They are sometimes called “magic” methods, referring to the fact that they happen behind the scenes, but there is nothing magic about them. You just need to read the documentation to understand how things work. Many built-in operations in Python, such as len(), indexing with [], and arithmetic operators, are executed by calling the corresponding dunder method.
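
A few of these correspondences can be checked directly. This is a minimal sketch using a plain dict and integers; the variable names are only for illustration.

```python
d = {'a': 1}

same_len = len(d) == d.__len__()            # len(d) calls d.__len__()
same_item = d['a'] == d.__getitem__('a')    # d['a'] calls d.__getitem__('a')
same_add = (1 + 2) == (1).__add__(2)        # 1 + 2 calls (1).__add__(2)
same_str = str(d) == d.__str__()            # str(d) calls d.__str__()

print(same_len, same_item, same_add, same_str)
```

In everyday code we use the left-hand forms; the dunder methods matter mainly when we want our own classes to respond to these operations.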

11.2.2 Method Overloading

In Example 11.3 we completely overwrote the __str__() method to make the output of our dict look nicer. We don’t always want to overwrite the parent class behavior. Often we just want to augment it, which can be done by using “method overloading”.

Example 11.4 (Understand method overloading and super() function) Create a subclass of dict that prevents users from writing negative numbers, but lets everything else be written.

Solution

When we do d[key] = value Python will call d.__setitem__(key, value) behind the scenes. Therefore, if we want to ensure value >= 0, we need to “rewrite” the __setitem__ method to check this and respond accordingly.

class MyDict4(dict):
    
    def __setitem__(self, key, value):  # (1)
        if value >= 0:  # (2)
            super().__setitem__(key, value)  # (3)
        else:
            print("Can't write negative numbers to this class")


d = MyDict4()
d['test'] = 1.0
d['test'] = -10

1: Here we are “overloading” the __setitem__ method of the dict class, because Python will call our version first. Our version must accept the same arguments as the parent class (dict) so that it has the same behavior. In this case our method must accept key and value.
2: Here we perform our familiar conditional check. If this check passes, then we proceed to write the data on the next line, but there is a catch…
3: If we do self[key] = value we will trigger another call to our own __setitem__, which calls itself again, and so on, until Python stops with a RecursionError. The same happens if we call self.__setitem__(key, value) directly. The solution to this conundrum is to call the __setitem__ method of the parent class, where this check does not occur. The super() function does this for us. The terms “parent class” and “child class” are synonymous with “super class” and “subclass”, hence the function super().

Can't write negative numbers to this class
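
The catch described in note 3 can be seen in a small sketch. LoopingDict is a hypothetical class written only to show the problem; it makes the mistake of writing to self inside its own __setitem__.

```python
class LoopingDict(dict):

    def __setitem__(self, key, value):
        # This assignment calls LoopingDict.__setitem__ again, forever
        self[key] = value


d = LoopingDict()
try:
    d['x'] = 1
    outcome = 'stored'
except RecursionError:
    # Python gives up once the calls are nested too deeply
    outcome = 'RecursionError'

print(outcome)
```

Replacing the assignment with super().__setitem__(key, value) breaks the cycle, because the parent version writes the value without calling our method again.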

Comments

Here is a flow chart of the above logic:

flowchart TD
    A(Parent Class)
    B(Sub Class)
    C(Object)
    A -->|Inherit from parent class and augment __setitem__| B
    B --> C

Example 11.5 (Putting it all together) Create a subclass of a dict with the following special abilities. It should:

assign a name to the object during initialization
ensure that all values written to the object are numpy.ndarrays
provide a method that prints the shape of each array stored in each key

Solution

import numpy as np
from uuid import uuid4


class NDDict(dict):

    def __init__(self):  # (1)
        self.name = str(uuid4())

    def __setitem__(self, key, value):  # (2)
        try:
            value = np.array(value)
            super().__setitem__(key, value)
        except Exception:
            print("Received value cannot be converted to an NDarray")

    def info(self):  # (3)
        s = []
        for k, v in self.items():
            s.append(str(k) + ': ' + str(v.shape) + '\n')
        s = ''.join(s)
        return s


d = NDDict()
d['arr1'] = [[1, 2]]
print(d.info())

1: This method is overwritten
2: This method is overloaded
3: This method is added

arr1: (1, 2)

Comments

Note that our __init__ does not accept any key-value pairs or dicts. This way values cannot be loaded during initialization, and every value written with d[key] = value must go through the __setitem__ method, which applies our rule that all values must be ndarrays.
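
One caveat is worth checking for yourself. In CPython, built-in dict methods such as update() do not route through an overloaded __setitem__, so a rule placed there only guards writes of the form d[key] = value. The sketch below uses a hypothetical PositiveDict, modeled on Example 11.4, to show this.

```python
class PositiveDict(dict):

    def __setitem__(self, key, value):
        # Silently ignore negative values (hypothetical rule for this sketch)
        if value >= 0:
            super().__setitem__(key, value)


d = PositiveDict()
d['a'] = -1       # goes through our __setitem__, so it is rejected
d.update(b=-1)    # dict.update writes directly, bypassing our check

print(d)
```

If the rule must hold everywhere, methods like update() would need to be overloaded as well.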

11.3 Abstraction

The point of abstraction is to make the details of the implementation hidden from the user, but to provide them with a consistent interface.

Here we create two classes for different geometric shapes. Both have a “surface area”, but these are computed differently. The user doesn’t care about how we do it, they just want the result.

class Circle():  # (1)

    def area(self, r):  # (2)
        A = np.pi*r**2
        return A


class Sphere():

    def area(self, r):  # (3)
        A = 4*np.pi*r**2
        return A


c = Circle()
print(c.area(10))
s = Sphere()
print(s.area(10))

1: Note that we don’t have to inherit from any particular class. If we leave this space blank we’ll automatically inherit from Python’s generic object class, so this line is equivalent to class Circle(object):.
2: We define an area method, inside of which we compute the area for the shape corresponding to this particular subclass (Circle)
3: The Sphere subclass has a method with the same name as Circle and does effectively the same thing, but using the appropriate formula.

314.1592653589793
1256.6370614359173
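
The benefit of a consistent interface shows up when we treat different objects the same way. In this sketch we loop over a list of shapes and call area() on each one without caring which class it is. It uses math.pi instead of np.pi only to keep the example free of imports beyond the standard library.

```python
import math


class Circle:

    def area(self, r):
        return math.pi * r**2


class Sphere:

    def area(self, r):
        return 4 * math.pi * r**2


shapes = [Circle(), Sphere()]
# The same call works on every shape; each class supplies its own formula
areas = [shape.area(10) for shape in shapes]

print(areas)
```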

11.4 Encapsulation

The final of the four main features of OOP is encapsulation, which means that the methods attached to the class are meant to be applied to the data attached to the same class.

Consider again the Circle and Sphere classes, but in this case we’ll attach r to them:

class Shape():

    def __init__(self, r):  # (1)
        self.r = r  # (2)


class Circle(Shape):

    def area(self): 
        A = np.pi*self.r**2
        return A


class Sphere(Shape):

    def area(self):
        A = 4*np.pi*self.r**2
        return A


circle = Circle(r=10)
print(circle.area())

1: Here we overwrite the __init__ method of the parent class. Our new init requires that users specify a value for r.
2: We then attach the value of r to self. This value is then available inside all the methods. It is an intrinsic attribute of the object which we can rely on being present since it is impossible to initialize an instance without it.

314.1592653589793
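
The claim in note 2, that an instance cannot be created without r, can be tested with a short sketch. It again uses math.pi in place of np.pi, and the variable name created is only for illustration.

```python
import math


class Shape:

    def __init__(self, r):
        self.r = r


class Circle(Shape):

    def area(self):
        return math.pi * self.r**2


try:
    Circle()            # r is missing, so __init__ refuses to run
    created = True
except TypeError:
    created = False

circle = Circle(r=10)   # with r supplied, the data travels with the object
print(created, circle.r)
```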

11.5 A Closer Look at Method Overloading

Warning: The following information is crucial to doing OOP properly, but is getting a bit deep into the weeds for the purposes of this class.

Correctly overloading a method can be a bit tricky because we must write a function that not only does our custom step(s) but also the step(s) that are required by the parent class. The main reason that can be tricky is that we need to deal with all the arguments that are passed to the method.

Let’s look at how to overload the __init__ method of a dict.

The following implementation is wrong:

from uuid import uuid4  # (1)


class MyDict(dict):

    def __init__(self):  # (2)
        self.name = str(uuid4())  # (3)


d = MyDict()
print(d.name)

1: UUID stands for Universally Unique IDentifier. uuid4 is a function which generates a random identifier, which we convert to a string of characters with str(). How unique? If you generate 103 trillion values using uuid4, the probability of finding even one duplicate is about one in a billion.
2: The __init__ function is a special method which gets called automatically when you instantiate a class. When we call MyDict() all the code inside the __init__() function gets run.
3: This line of code will be run when we instantiate MyDict, which means that every instance of MyDict will have a unique name since uuid4() is called each time.

580df7c3-9f75-4818-b83a-9399926ef8e0

The reason the above implementation does not work is that d = MyDict(a=1) will not write 1 to the key 'a'. In fact, an error will occur because the __init__ method does not accept any arguments (other than self)!
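
The failure is easy to reproduce. This sketch repeats the wrong implementation and catches the resulting error; the variable name error is only for illustration.

```python
from uuid import uuid4


class MyDict(dict):

    def __init__(self):
        self.name = str(uuid4())


try:
    d = MyDict(a=1)     # __init__ has no parameter that can receive a=1
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__)
```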

The correct way to do this is:

from uuid import uuid4


class MyDict(dict):

    def __init__(self, *args, **kwargs):  # (1)
        super().__init__(*args, **kwargs)  # (2)
        self.name = str(uuid4())  # (3)


d = MyDict({'a': 1}, b=2)
print(d)

1: The * and ** here are a form of argument unpacking/repacking. In Section 7.6 we saw that you can pass a list (or tuple) of positional arguments to a function using func(*args), and you can pass a dict of keyword arguments to a function using func(**kwargs). Here we see the reverse: the function collects any positional arguments it receives into a tuple called args, and any keyword arguments into a dict called kwargs.
2: This line passes both args and kwargs to the __init__ method of the parent class using the * and ** syntax. This is crucial to OOP. Our version of __init__ does not do anything with the information in args and kwargs, so we just pass them “up” to the parent class to deal with. This allows us to augment the functionality of the parent class, by giving the parent class the information it needs to do its normal job, then adding our own bits.
3: After we’ve done the initialization of the parent class we can add our name.

{'a': 1, 'b': 2}

To summarize the above point:

The def ...(*args, ...) just means “collect all positional arguments in a tuple called args”.
The def ...(..., **kwargs) just means “collect all keyword arguments in a dict called kwargs”.
We are actually free to do whatever we want with args and kwargs. We choose to pass them on to the __init__ method of the parent class, but we don’t have to.
Also, we can call them whatever we want (i.e. *bob instead of *args), but args and kwargs are the conventional names, so only call them something else if it is really important.
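
The points above can be seen in a short sketch. The function show() is hypothetical and exists only to reveal what ends up in args and kwargs; the class is the corrected MyDict from this section.

```python
from uuid import uuid4


def show(*args, **kwargs):
    # Return exactly what was collected, so we can inspect it
    return args, kwargs


collected = show(1, 2, a=3)   # positional values go to args, named ones to kwargs


class MyDict(dict):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)  # let dict handle its usual arguments
        self.name = str(uuid4())           # then add our own attribute


d = MyDict({'a': 1}, b=2)
print(collected)
print(d, len(d.name))
```

Here the positional values arrive as a tuple and the named value as a dict, and MyDict ends up with both the normal dict contents and a name.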