Python Objects: A Mass Information Drop
Introduction
Python is an object-oriented programming (OOP) language. An object is a collection of data (variables, often called attributes) and the methods (functions) that work on that data. Just about everything in Python is one of these objects, and today we will talk about object ids, types, and the difference between mutable and immutable objects.
ID and Type
If you have any experience with C code you probably already have a rough idea of what an object type is. When declaring a variable in C you have to state its type, such as:
int n = 0;
In this scenario you just created a data object of type int labeled n and set it to 0. In Python, declaring a variable is simpler because you do not have to state its type up front, so to achieve the same effect as above we would just use this line of code:
n = 0
Because Python is a higher-level language meant to hide some of the technical details, we do not have to declare the type beforehand, nor do we have to end our statement with a semicolon; however, the type is still there and can be checked with the function type(). For example, in Python's interactive mode we could do the following:
>>> a = 0
>>> type(a)
<type 'int'>
>>> a = 'c'
>>> type(a)
<type 'str'>
First we set a to 0. Even though we did not specify that a was going to be an integer (int) like we would in C, type(a) confirmed that it was in fact stored as an int. Another benefit of Python is that a variable is not tied to a single type the way it is in C, so we were allowed to immediately assign a the value ‘c’. When we tested the type of a again we were told that it was of type string (str). On the other side of object identity we have the id() function, which returns the id of an object once it exists. Continuing on from the example above:
>>> a = 'c'
>>> type(a)
<type 'str'>
>>> id(a)
34091456L
>>> a = 0
>>> id(a)
33777472L
>>> a = 'c'
>>> id(a)
34091456L
This example lets us highlight a few key points about Python objects. First, we can see the id() function in action. Second, we see how Python behaves when you assign to a variable. In C, a variable names a fixed memory location and updating the variable simply updates what is stored there. In Python, an assignment creates (or finds) an object and makes the variable a reference to it; the id function returns the id of the object currently being referenced. When you reassign a, in our example above changing it from ‘c’ to 0, Python does not overwrite what is stored in a. It makes a reference the object 0 instead. In this session the previous object, ‘c’, still exists, which is why the id is the same when we reassign a to ‘c’. That reuse is an optimization CPython applies to small, common values such as short strings and small integers, not something the language guarantees.
*Note on ids: they are specific to a running instance. Your id for ‘c’ will not match mine, and if I were to close my program and reopen it the id would be different from the instance before.
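The same idea can be run as a small script. This is a minimal sketch written for Python 3 (where type() prints <class 'str'> rather than <type 'str'>); the name old is only an illustrative extra reference that keeps the string object alive so the two ids can be compared.

```python
a = 'c'
old = a                    # a second reference to the same string object
a = 0                      # rebinding: a now refers to a different object

print(type(old), type(a))  # the type travels with the object, not the name
print(old)                 # the string object itself was never modified
print(id(old) != id(a))    # two distinct live objects have distinct ids
```

The actual id values are left out on purpose, since they differ from run to run.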
To further play with this idea we can run another test. Still within the same instance as above:
>>> b = 'c'
>>> id(b)
34091456L
When we declared b here and set it to ‘c’, we were simply creating a new reference to the existing object ‘c’, so comparison tests treat them as the same:
>>> b is a
True
>>> b == a
True
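This works because strings are immutable, so Python can safely reuse a single ‘c’ object for both names (CPython does this for short strings, though the language does not promise it). That brings us to our next topic: mutable vs immutable objects.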
Immutable Objects
Immutable objects include integers (int), floats, booleans (bool), strings (str), unicode (a separate type in Python 2), and tuples. Once created, an immutable object cannot be changed. ‘c’ will always be ‘c’, 0 will always be 0, the tuple (1, 2) will always be (1, 2). When you do a reassignment like the ones above, Python does not change the object; it makes the variable reference a different object instead of the old one.
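To see what "cannot be changed" looks like in practice, here is a small sketch (Python 3) that tries to modify a string and a tuple in place. The names s, t, errors, and upper are just for illustration.

```python
s = 'c'
t = (1, 2)
errors = []

try:
    s[0] = 'd'             # strings do not support item assignment
except TypeError as exc:
    errors.append('str')
    print('str:', exc)

try:
    t[0] = 9               # neither do tuples
except TypeError as exc:
    errors.append('tuple')
    print('tuple:', exc)

upper = s.upper()          # methods return a new object instead
print(s, upper)            # c C
```

Methods that look like they change a string, such as upper(), hand back a new string and leave the original alone.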
Mutable Objects
Mutable objects include sets, lists, and dictionaries (dict), as well as most custom classes (a class groups variables and functions, known as methods, together under a single object). Unlike an immutable object, a mutable object can have its contents changed after it is created. The best way to understand what being mutable means is to compare mutable objects to immutable ones.
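As a quick sketch of that claim (Python 3), the snippet below changes the contents of a list, a dict, a set, and an instance of a custom class, and checks that each one is still the same object afterwards. The Counter class and all variable names are hypothetical, made up for this example.

```python
class Counter:
    """A tiny custom class; its instances are mutable by default."""
    def __init__(self):
        self.count = 0

numbers = [1, 2]
ages = {'ann': 30}
seen = {1}
counter = Counter()

ids_before = [id(obj) for obj in (numbers, ages, seen, counter)]

numbers.append(3)          # change the list in place
ages['bob'] = 25           # add a key to the dict
seen.add(2)                # add an element to the set
counter.count += 1         # change an attribute on the instance

ids_after = [id(obj) for obj in (numbers, ages, seen, counter)]
print(ids_before == ids_after)   # True: same objects, new contents
```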
Mutable vs Immutable
As stated before, once an immutable object is created, it cannot be changed.
>>> a = 0
>>> id(a)
33777472L
>>> a += 1
>>> id(a)
33777448L
>>> a = a + 1
>>> id(a)
33777424L
>>> a -= 2
>>> id(a)
33777472L
>>> b = 2
>>> c = a + b
>>> id(c)
33777424L
The above example illustrates this. First we set a to 0, then we did some math, and with each operation the program simply pointed a at the object holding the answer: first the object 1, then the object 2, then back to 0. When we set b to 2 it simply made b reference the object 2 that already existed. When we set c to a + b (or 0 + 2) it once again set c to the object 2 that already existed. (The ids repeat here because CPython keeps small integers, roughly -5 through 256, cached and reuses them; larger results would generally get fresh objects.) On the other hand, if we take a mutable object like a list and change it a bit, the id stays the same:
>>> a = [1, 2, 3]
>>> id(a)
40974600L
>>> a.append(4)
>>> id(a)
40974600L
>>> a
[1, 2, 3, 4]
>>> a += [4]
>>> a
[1, 2, 3, 4, 4]
>>> id(a)
40974600L
As we can see, we created the list a and it kept the same id after we added to the list in two separate ways. This brings up another point: whereas assigning two different variables the same immutable value (a = ‘c’, b = ‘c’) caused them both to be references to the same object, mutable objects that are created separately will be separate objects, even when their contents match.
>>> s1 = "HNBN"
>>> s2 = "HNBN"
>>> id(s1)
40467712L
>>> id(s2)
40467712L
>>> la = [1, 2, 3]
>>> lb = [1, 2, 3]
>>> id(la)
41045320L
>>> id(lb)
41061000L
In this example we set two variables equal to the same string, HNBN, which created one HNBN object and set both variables as references to it; sharing is safe because strings are immutable. On the other hand, even though we set both lists to [1, 2, 3], because lists are mutable each variable became a reference to its own list. However, if we had set lb to la it would have the same id, like so:
>>> la = [1, 2, 3]
>>> id(la)
41045320L
>>> lb = la
>>> id(lb)
41045320L
In this situation we simply set lb to be a reference to the same object as la rather than creating a new object. When we change la in place at this point, lb changes too, but not every operation on la is an in-place change, as seen below.
>>> la += [4]
>>> lb
[1, 2, 3, 4]
>>> la = la + [4]
>>> lb
[1, 2, 3, 4]
>>> la
[1, 2, 3, 4, 4]
Now I know what you are thinking: if lb is la, why did la = la + [4] change la but not lb? The answer is that append and += modify the existing list in place, whereas la = la + [4] builds a new list out of the two lists and rebinds la to it, as can be seen by checking their ids:
>>> id(lb)
41045320L
>>> id(la)
41121928L
lb still has the same id from back when la was [1, 2, 3], but la has a new id, indicating that a new object was made.
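The difference between an alias, a copy, and a rebinding can be put side by side. This is a minimal sketch (Python 3); lc is an extra name introduced here, and list(la) is one common way to make a shallow copy.

```python
la = [1, 2, 3]
lb = la            # alias: a second name for the same list
lc = list(la)      # shallow copy: a separate list with the same items

la += [4]          # in-place change, visible through the alias
print(lb)          # [1, 2, 3, 4]
print(lc)          # [1, 2, 3]

la = la + [5]      # builds a new list and rebinds la to it
print(la)          # [1, 2, 3, 4, 5]
print(lb)          # [1, 2, 3, 4]
print(la is lb)    # False
```

If you want a second list that will not follow changes to the first, copy it instead of assigning it.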
What this means for functions
When calling a function, arguments are always passed as references to objects (sometimes described as pass by object reference), meaning that the function receives a reference to the same object rather than a copy of the data. For immutable objects this means whatever changes are made inside the function are lost, since "changing" an immutable object really just points the function's local name at a different object, like so:
>>> a = 0
>>> def adder(n):
...     n += 1
...
>>> a
0
>>> adder(a)
>>> a
0
A mutable object, however, can have its contents changed, so if we change it inside a function the change is visible outside the function as well:
>>> la = [0]
>>> def adder(n):
...     n[0] += 1
...
>>> la
[0]
>>> adder(la)
>>> la
[1]
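Putting the two cases together, here is a short sketch (Python 3) of a function that rebinds its parameter, one that mutates it, and the usual way to get a new value for an immutable argument, which is to return it. The function names rebind, mutate, and add_one are made up for this example.

```python
def rebind(values):
    values = values + [99]   # new list bound to the local name only
    return values

def mutate(values):
    values.append(99)        # in-place change to the caller's list

def add_one(n):
    return n + 1             # ints are immutable, so return the result

data = [0]
result = rebind(data)
print(data)      # [0]
print(result)    # [0, 99]

mutate(data)
print(data)      # [0, 99]

a = 0
a = add_one(a)   # rebind a to the returned object
print(a)         # 1
```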