When starting out in Python, it is easy to think that the expression list2 = list1 will make list2 contain a copy of list1. But it doesn’t take long to realise that this expectation doesn’t match what actually happens.
Take the following code as an example:
>>> list1 = [1, 2, 3, 4, 5]
>>> list2 = list1
>>> list1.append(6)
>>> print(list1)
[1, 2, 3, 4, 5, 6]
>>> print(list2)
[1, 2, 3, 4, 5, 6]
Huh?
Why did list2 end up with the same contents as list1, even though the extra element was appended to list1 after list2 had already been assigned?
This behaviour can be confusing and can lead to unexpected bugs if it is not properly understood.
When you assign one list variable to another in Python, no new list is created. The assignment simply makes the new variable refer to the same list object as the original.
This can be helpful in certain situations, but it needs to be managed carefully.
Memory Management in Python
Python is a high-level programming language that is known for its simplicity and ease of use. However, it is important to understand the underlying memory management mechanism in Python to avoid unexpected behavior in your code.
Python manages memory automatically. In CPython, the reference implementation, an object is freed once nothing refers to it any more, which is tracked by reference counting, and a separate garbage collector cleans up groups of objects that only refer to each other. This means developers rarely need to allocate or free memory by hand.
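The effect of adding and removing references can be observed with sys.getrefcount(). This is a minimal sketch that assumes CPython; the variable names are illustrative, and the absolute counts differ between versions, so only the differences between them are meaningful.

```python
import sys

data = [1, 2, 3]
before = sys.getrefcount(data)       # baseline count for the list (CPython-specific)

alias = data                         # bind a second name to the same list
after_alias = sys.getrefcount(data)

del alias                            # remove the second name; the list itself survives
after_del = sys.getrefcount(data)

print(after_alias - before)          # extra references held while alias exists
print(after_del - before)            # difference once alias has been deleted
```

Deleting a name only removes one reference. The list is freed when the last reference to it disappears.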
When a variable is assigned a value, Python creates an object in memory to represent that value and binds the variable name to it. When one variable is then assigned to another, both names refer to the same object in memory. This is why list2 = list1 does not create a new copy of the original list: it binds the name list2 to the object that list1 already refers to.
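The is operator checks whether two names refer to the same object, while == only compares contents. A short sketch (the variable names are illustrative) separates the two cases:

```python
list1 = [1, 2, 3]
list2 = list1          # second name for the same object
list3 = [1, 2, 3]      # separate object that happens to hold equal values

print(list2 is list1)  # True: same object
print(list3 == list1)  # True: equal contents
print(list3 is list1)  # False: different objects
```

Two lists that are built separately are distinct objects even when they compare equal. Only assigning one name to another makes the names share an object.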
Here’s how this looks in practice, using the id() function to inspect the objects involved:
>>> list1 = [1, 2, 3, 4, 5]
>>> id(list1)
140628380440576
As shown in the short interactive snippet above, the id() function returns an integer that uniquely identifies the given object for as long as that object exists (in CPython, this is the object's memory address). In this case, the object is list1, which contains the elements 1, 2, 3, 4, and 5 and has the identifier 140628380440576 on my machine (when you run this code on your machine you will see a different value).
Note also that the value will vary each time you run the code, depending on where Python allocates memory for the list.
With this in mind, assign the existing variable list1 to a new variable list2 and compare the identifiers:
>>> list1 = [1, 2, 3, 4, 5]
>>> id(list1)
140628380440576
>>> list2 = list1
>>> id(list2)
140628380440576
Notice that the identifier of list2 above is exactly the same as that of list1. Further proof can be seen by running this check:
>>> id(list1) == id(list2)
True
Therefore, as demonstrated above, assigning an existing variable to a new one does not create a copy of the object referenced by the original variable.
Instead, the new variable refers to the same object.
How Do You Create A Copy Of A List?
It is important to be aware of this behaviour when working with mutable objects like lists and dictionaries. If you want to create a new copy of a list, you can use the list.copy() method or the shorthand slice operator [:]. For example:
- list2 = list1.copy()
- list2 = list1[:]
Using these methods will create a new copy of the original list, which can be modified without affecting the original list.
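The two techniques above are not the only options: the list() constructor and copy.copy() from the standard library also produce a new list. The sketch below checks all four side by side. The copies dictionary and its labels are illustrative only.

```python
import copy

list1 = [1, 2, 3, 4, 5]

# Four common ways to make a new list with the same elements.
copies = {
    "list.copy()": list1.copy(),
    "slice [:]": list1[:],
    "list()": list(list1),
    "copy.copy()": copy.copy(list1),
}

for label, candidate in copies.items():
    # Each candidate has equal contents but is a separate list object.
    print(label, candidate == list1, candidate is list1)

list1.append(6)  # only the original changes; the copies keep five elements
```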
Using the id() function, you can see that either of the above techniques creates a new list object:
>>> list1 = [1, 2, 3, 4, 5]
>>> id(list1)
140555758526848
>>> list2 = list1.copy()
>>> id(list2)
140555758526976
As you can see, copying list1 with the list method .copy() has created a distinct list object with its own identifier.
This means that a change to the original list (or to the copied list) does not affect the other list:
>>> list1 = [1, 2, 3, 4, 5]
>>> list2 = list1.copy()
>>> list1.append(6)
>>> list1
[1, 2, 3, 4, 5, 6]
>>> list2
[1, 2, 3, 4, 5]
As you can see from the above output, appending the element 6 to the original list list1 did not affect the copied list list2.
Does this always happen when copying lists in Python?
Not quite. This type of copying is known as shallow copying, and it can still cause problems if you’re not aware of how it differs from deep copying.
Shallow Copy vs Deep Copy
When working with lists in Python, it’s important to understand the difference between a shallow copy and a deep copy. A shallow copy creates a new list object, but the elements of the new list still reference the same objects as the original list. In contrast, a deep copy creates a new list object and new objects for each element in the list.
Here’s an example demonstrating the problem of creating a shallow copy of a list which contains another list object as its element:
>>> list1 = [[1, 2, 3], 'a']
>>> id(list1[0])
140555758571968
>>> list2 = list1.copy()
>>> id(list2[0])
140555758571968
Notice in the above code that, although a new copy of the outer list was created, its first element is itself a list, and id() shows that this first element is the same object in both lists.
This means that an in-place change to that first element will also show up in the copied list, as demonstrated below:
>>> list1 = [[1, 2, 3], 'a']
>>> list2 = list1.copy()
>>> list1[0].append(4)
>>> list1[0]
[1, 2, 3, 4]
>>> list2[0]
[1, 2, 3, 4]
As you can see from the above code, even though the element 4 was only appended through list1, the change also appears in list2, because both outer lists still reference the same inner list.
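It can help to separate two kinds of change after a shallow copy: replacing an element of the outer list, and mutating a shared inner object in place. The following sketch reuses the same list shape as above. The replacement values are arbitrary.

```python
list1 = [[1, 2, 3], 'a']
list2 = list1.copy()

# Replacing an element only changes list1's own outer list.
list1[1] = 'b'
print(list2[1])        # a

# Mutating the shared inner list is visible through both outer lists.
list1[0].append(4)
print(list2[0])        # [1, 2, 3, 4]

# Replacing the inner list in list1 ends the sharing for that position.
list1[0] = [9, 9]
print(list2[0])        # [1, 2, 3, 4]
```

Only in-place changes to shared mutable elements leak between a list and its shallow copy.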
How To Create A Deep Copy
If you want to create a new list object whose elements are also independent of the original list, you need to use the deepcopy() function from the copy module. This function creates a new list object and recursively copies the objects it contains. This means that any changes you make to the new list, including its nested elements, will not affect the original list.
Here’s an example demonstrating how this is used:
>>> from copy import deepcopy
>>> list1 = [[1, 2, 3], 'a']
>>> list2 = deepcopy(list1)
>>> list1[0].append(4)
>>> list1
[[1, 2, 3, 4], 'a']
>>> list2
[[1, 2, 3], 'a']
As you can see from the above output, appending a new element to the nested list in the original no longer affects the copied list. Further evidence can be seen by using the id() function on the first element of each list:
>>> id(list1[0])
140555758526848
>>> id(list2[0])
140555758573696
It’s important to keep in mind that creating a deep copy can be more time-consuming and memory-intensive than creating a shallow copy. If your list contains only immutable values such as numbers and strings, a shallow copy is sufficient. However, if your list contains mutable objects such as other lists or dictionaries and you want to ensure that the original is not affected, a deep copy may be necessary.
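To get a rough feel for the cost difference, the two kinds of copy can be timed with the standard timeit module. This is only a sketch: the nested test data and the repeat count are arbitrary assumptions, and the timings depend on your machine and Python version.

```python
import copy
import timeit

# Hypothetical test data: 1,000 inner lists of 10 integers each.
nested = [list(range(10)) for _ in range(1000)]

# Time 20 shallow copies and 20 deep copies of the same structure.
shallow_time = timeit.timeit(lambda: copy.copy(nested), number=20)
deep_time = timeit.timeit(lambda: copy.deepcopy(nested), number=20)

print(f"shallow copy: {shallow_time:.4f} s")
print(f"deep copy:    {deep_time:.4f} s")

# The shallow copy shares its inner lists; the deep copy does not.
print(copy.copy(nested)[0] is nested[0])
print(copy.deepcopy(nested)[0] is nested[0])
```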
So how does all this help to understand the concept of assignment when it comes to lists?
Why list2 = list1 Does Not Create a New Copy
When using list2 = list1 in Python, you now know that it does not create a new copy of the original list. Instead, it creates a reference to the same list object in memory. This means that any changes made to the list through either variable will be reflected in both variables.
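It’s important to note that assignment works this way for every object in Python, not just lists. Strings and integers are also shared by reference when assigned to a new variable, but because they are immutable they cannot be changed in place, so the sharing never causes this kind of surprise.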
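For immutable values such as strings, a short sketch shows that assignment shares the object as well, and that what looks like a modification actually builds a new object and rebinds the name. The variable names are illustrative.

```python
text1 = "hello"
text2 = text1                # no copy is made here either
print(text2 is text1)        # True: both names refer to one string object

text1 += " world"            # builds a new string and rebinds text1
print(text1)                 # hello world
print(text2)                 # hello
print(text2 is text1)        # False: the names now refer to different objects
```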
Conclusion
When it comes to creating new copies of lists in Python, it is important to remember that the assignment operator (=) does not create a new copy of the list. Instead, it simply creates a reference to the original list.
While this behaviour may seem counterintuitive, it is actually quite useful in certain situations. For example, if you have a large list that you need to modify in multiple places throughout your code, creating a new copy of the list each time can be inefficient and memory-intensive. By using references to the original list, you can modify the list in place without having to create multiple copies.
However, it is important to be aware of the potential pitfalls of using references in this way. If you modify the original list in one place, those changes will be reflected in all references to the list. This can lead to unexpected behavior if you are not careful.
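A common place where this shows up is function arguments, because passing a list to a function also passes a reference. In the sketch below, add_total and scores are hypothetical names chosen for illustration. Passing a copy is one way to protect the caller's list.

```python
def add_total(values):
    """Hypothetical helper: append the sum of values to the list in place."""
    values.append(sum(values))
    return values

scores = [10, 20, 30]
result = add_total(scores)          # the function receives the same list object
print(scores)                       # [10, 20, 30, 60]
print(result is scores)             # True

scores = [10, 20, 30]
result = add_total(scores.copy())   # pass a shallow copy instead
print(scores)                       # [10, 20, 30]
print(result)                       # [10, 20, 30, 60]
```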
Overall, while the behaviour of the assignment operator when working with lists in Python may seem confusing at first, it is actually a powerful tool that can help you write more efficient and effective code. By understanding how references work in Python, you can take advantage of this behaviour to write cleaner, more concise code that is easier to maintain and debug.
Next you might like to explore more about the id() function in Python that was referenced throughout this article.