Why is it when you copy a list in Python doing b_list = a_list
that, any changes made to a_list
or to b_list
modify the other list?
If you’ve played with lists in Python you will reach a point where you want to copy a list to make modifications to it without changing the original list. Initially you would think the following would work:
|
|
As you can see from the above example even though b_list
was the only list which had an extra element added to it, a_list
also changed!
Why Does A Copied List Change?
The reason why a list, when “copied” using the approach of b_list = a_list
, doesn’t work as expected is that the new variable b_list
is assigned the reference of a_list
. It does not create an actual copy of the list.
You can know this by checking the id()
of each variable.
What Does id()
Do?
Python’s id()
function returns a unique identifier for the variable passed as an argument. This object can be any type of variable, such as a dictionary, a list, a tuple, etc. The returned value is an integer and is guaranteed to be unique and constant for this object during its lifetime.
Using the above example again you can see that when the b_list
variable is assigned a_list
it has the same unique identifier from id()
:
|
|
As you can see, both variables have the same id
. Therefore, whenever you are “copying” a variable using a simple variable assignment like b_list = a_list
you now have two unique variables operating on the same object.
Create An Actual Copy Of A List
So how do you create an actual copy of a list without just copying the reference of the original list?
There are several ways to copying a list, each has their unique approaches and constraints, and these are:
- Using the
list.copy()
method - Using the copy.deepcopy() method
- Using the empty slice operator
[:]
- Using the
list()
constructor - Using
json.dumps()
and thenjson.load()
I will demonstrate how you can use each approach with examples.
list.copy()
Method
The list.copy()
method in Python is used to create a shallow copy of the original list object. This means that a new object is created containing the same values as the original object and is stored in different memory locations.
This means when you are operating on one list, such as appending or removing items, it will not affect the elements contained in the other.
Here’s an example of how this method works and how it creates a unique id
:
|
|
As you can see from the above Python code output the b_list
variable created has its own unique identifier. This means that b_list
can be operated on independent of a_list
meaning that changes made on either b_list
or a_list
will not change the other.
|
|
Notice how b_list
changes, and the change made to that list doesn’t impact the original a_list
which was copied.
However, there is one thing to be mindful of with the .copy()
method: it is a shallow copy. This means that the original list contains mutatable objects (objects that can be changed, for example: lists, dictionaries, etc.).
Shallow Copy Problems
The issue with shallow copying objects is that if the copied object contains mutating objects, like lists or dictionaries, that changes made to those mutating elements in the original list will only be copied as references.
Here’s an example of a simple list with a mutating object within it and how using the .copy()
method produces issues:
|
|
As demonstrated in the above code, the two lists both have unique identifiers, which means the contents can change without impacting the other. However, notice the first element in each list points to the same id.
From what you’ve learned at the very start of this article you’ll remember that when items such as lists contain the same id changes made to that object will change the other.
Continuing on from the above code, and modifying that first element produces the following startling result:
|
|
As you can see from the above output the changes performed on the first element in a_list
ended up modifying b_list
even though each list has its own unique identifier!
It’s not until you inspect the first element’s id
on both lists that you realise they have the same reference.
This is the one downside when copying lists using the shallow copy .copy()
method on your original list: if your original list contains lists or dictionaries (mutatable objects) then these are copied by reference.
So how do you overcome this problem?
This is why another more popular method for copying lists circumvents this issue and is found in the copy
library. The method is called .deepcopy()
.
copy.deepcopy()
Method
If your list contains mutating objects such as lists and dictionaries then would want to prefer the other copying method copy.deepcopy()
over list.copy()
.
For this method to work you will need to import the copy
library using import copy
at the top of your code before calling the method copy.deepcopy()
.
In the Python documentation on .deepcopy()
this method constructs its own compound objects from the original list. The best way to demonstrate the difference is through an example very similar to one above:
|
|
From the above output from the Python REPL you can see that using copy.deepcopy()
produces a distinct copy of the original list. Even the nested list in the first element of a_list
when amended does not alter the copied b_list
first element, and running the id()
function on the first element shows they both have unique identifiers.
The only issue with using this option is that it does take longer to perform copies and you will need to remember to import the copy
library. If speed isn’t a concern and you’re dealing with elements within a list that are likely to be lists or dictionaries then this should be your preferred method when copying lists.
Other List Copying Hacks
There are other techniques to perform shallow copies of a list, while these produce the same result they are different ways of writing it in Python:
Shallow Copy Slice Operator [:]
The empty slice operator performs the same as list.copy()
as demonstrated below:
|
|
As you can see from the example above the newly created b_list
is a shallow copy of the original a_list
but is not a deep copy of the original due to the first element in a_list
being the same as b_list
.
If typing .copy()
to the original list is 4 characters too long then you can use this shortcut technique to create a shallow copy of your original list.
list()
Constructor
The list constructor allows for changing a variable into the list data type. If you’re already inserting a list into the constructor then no real conversion is needed, but it does produce a new shallow copy list, as demonstrated below:
|
|
Once again another example of how to create a shallow copy of the original list.
json.dumps()
And json.loads()
If you want another technique for creating a deep copy of the original list then another approach could be to extract the original list using json.dumps
(which converts the list to a string) and then to use json.loads
to read the string and create another list.
Here’s how this process looks and the resulting id
checks:
|
|
As you can see from this result using the json.loads
in combination with the json.dumps()
methods you can achieve a deep copied list that has its own references for the elements in each list.
Summary
When assigning a new list variable from an existing one it can be easy to think that the contents of the copied list are cloned into the new list. However, it doesn’t take long until you realise that the new list actually makes changes to the original list!
To make distinct clones of the original list determine whether you will need to do a shallow or deep copy of the list first. Once you know the type of copying you will need to do you can then apply any of the techniques above to accomplish what you need.
Popular shallow copying techniques for lists involve: list.copy()
, copy.copy()
, list[:]
and list(list)
. All of these create a shallow copy meaning that if the list contains elements such as lists or dictionaries and these are likely to be modified after the assignment this will change both lists too.
Popular deep copying techniques for lists involve using the copy.deepcopy()
or json.loads(json.dumps(list))
techniques. Each method requires either the copy
or the json
library respectively.