Why is it when you copy a list in Python doing
b_list = a_list
that, any changes made to
a_list
or to
b_list
modify the other list?
If you’ve played with lists in Python you will reach a point where you want to copy a list to make modifications to it without changing the original list. Initially you would think the following would work:
>>> a_list = [1, 2, 3]
>>> b_list = a_list
>>> b_list.append(4)
>>> print(b_list)
[1, 2, 3, 4]
>>> print(a_list)
[1, 2, 3, 4]
As you can see from the above example even though
b_list
was the only list which had an extra element added to it,
a_list
also changed!
Why Does A Copied List Change?
The reason why a list, when “copied” using the approach of
b_list = a_list
, doesn’t work as expected is that the new variable
b_list
is assigned the
reference
of
a_list
. It does not create an
actual copy
of the list.
You can know this by checking the
id()
of each variable.
What Does
id()
Do?
Python’s
id()
function
returns a unique identifier for the variable passed as an argument. This object can be any type of variable, such as a dictionary, a list, a tuple, etc. The returned value is an integer and is guaranteed to be unique and constant for this object during its lifetime.
Using the above example again you can see that when the
b_list
variable is assigned
a_list
it has the same unique identifier from
id()
:
>>> a_list = [1, 2, 3]
>>> b_list = a_list
>>> id(b_list)
140272962785472
>>> id(a_list)
140272962785472
>>> id(b_list) == id(a_list)
True
As you can see, both variables have the same
id
. Therefore, whenever you are “copying” a variable using a simple variable assignment like
b_list = a_list
you now have two unique variables
operating on the same object
.
Create An Actual Copy Of A List
So how do you create an actual copy of a list without just copying the reference of the original list?
There are several ways to copying a list, each has their unique approaches and constraints, and these are:
-
Using the
list.copy()
method - Using the copy.deepcopy() method
-
Using the
empty slice operator
[:]
-
Using the
list()
constructor -
Using
json.dumps()
and thenjson.load()
I will demonstrate how you can use each approach with examples.
list.copy()
Method
The
list.copy()
method in Python
is used to create a
shallow copy
of the original list object. This means that a new object is created containing the same values as the original object and is stored in different memory locations.
This means when you are operating on one list, such as appending or removing items, it will not affect the elements contained in the other.
Here’s an example of how this method works and how it creates a unique
id
:
>>> a_list = [1, 2, 3]
>>> id(a_list)
140272962785472
>>> b_list = a_copy.copy()
>>> id(b_list)
140272962785600
>>> id(a_list) == id(b_list)
False
As you can see from the above Python code output the
b_list
variable created has its own unique identifier. This means that
b_list
can be operated on independent of
a_list
meaning that changes made on either
b_list
or
a_list
will not change the other.
>>> b_list.append(4)
>>> print(b_list)
[1, 2, 3, 4]
>>> print(a_list)
[1, 2, 3]
Notice how
b_list
changes, and the change made to that list doesn’t impact the original
a_list
which was copied.
However, there is one thing to be mindful of with the
.copy()
method: it is a
shallow copy
. This means that the original list contains
mutatable objects
(objects that can be changed, for example: lists, dictionaries, etc.).
Shallow Copy Problems
The issue with shallow copying objects is that if the copied object contains mutating objects, like lists or dictionaries, that changes made to those mutating elements in the original list will only be copied as references.
Here’s an example of a simple list with a mutating object within it and how using the
.copy()
method produces issues:
>>> a_list = [[1], 2, 3]
>>> b_list = a_list.copy()
>>> for a in a_list:
... print(id(a))
...
140272962833984
140272960989520
140272960989552
>>> for b in b_list:
... print(id(b))
...
140272962833984
140272960989520
140272960989552
>>> id(a_list) == id(b_list)
False
As demonstrated in the above code, the two lists both have unique identifiers, which means the contents can change without impacting the other. However, notice the first element in each list points to the same id .
From what you’ve learned at the very start of this article you’ll remember that when items such as lists contain the same id changes made to that object will change the other.
Continuing on from the above code, and modifying that first element produces the following startling result:
>>> a_list[0].append(2)
>>> print(a_list)
[[1, 2], 2, 3]
>>> print(b_list)
[[1, 2], 2, 3]
>>> id(a_list[0]) == id(b_list[0])
True
As you can see from the above output the changes performed on the first element in
a_list
ended up modifying
b_list
even though each list has its own unique identifier!
It’s not until you inspect the first element’s
id
on both lists that you realise they have the same reference.
This is the one downside when copying lists using the shallow copy
.copy()
method on your original list: if your original list contains lists or dictionaries (mutatable objects) then these are copied by reference.
So how do you overcome this problem?
This is why another more popular method for copying lists circumvents this issue and is found in the
copy
library. The method is called
.deepcopy()
.
copy.deepcopy()
Method
If your list contains mutating objects such as lists and dictionaries then would want to prefer the other copying method
copy.deepcopy()
over
list.copy()
.
For this method to work you will need to import the
copy
library using
import copy
at the top of your code before calling the method
copy.deepcopy()
.
In the
Python documentation on
.deepcopy()
this method constructs its own compound objects from the original list. The best way to demonstrate the difference is through an example very similar to one above:
>>> import copy
>>> a_list = [[1], 2, 3]
>>> b_list = copy.deepcopy(a_list)
>>> a_list[0].append(2)
>>> print(a_list)
[[1, 2], 2, 3]
>>> print(b_list)
[[1], 2, 3]
>>> id(a_list[0])
140272962839936
>>> id(b_list[0])
140272962839040
>>> id(a_list[0]) == id(b_list[0])
False
From the above output from the Python REPL you can see that using
copy.deepcopy()
produces a distinct copy of the original list. Even the nested list in the first element of
a_list
when amended does not alter the copied
b_list
first element, and running the
id()
function on the first element shows they both have unique identifiers.
The only issue with using this option is that it does take longer to perform copies and you will need to remember to import the
copy
library. If speed isn’t a concern and you’re dealing with elements within a list that are likely to be lists or dictionaries then this should be your
preferred method when copying lists
.
Other List Copying Hacks
There are other techniques to perform shallow copies of a list, while these produce the same result they are different ways of writing it in Python:
Shallow Copy Slice Operator
[:]
The
empty slice operator
performs the same as
list.copy()
as demonstrated below:
>>> a_list = [[1], 2, 3]
>>> b_list = a_list[:]
>>> id(a_list) == id(b_list)
False
>>> id(a_list[0]) == id(b_list[0])
True
As you can see from the example above the newly created
b_list
is a shallow copy of the original
a_list
but is not a deep copy of the original due to the first element in
a_list
being the same as
b_list
.
If typing
.copy()
to the original list is 4 characters too long then you can use this shortcut technique to create a shallow copy of your original list.
list()
Constructor
The list constructor allows for changing a variable into the list data type . If you’re already inserting a list into the constructor then no real conversion is needed, but it does produce a new shallow copy list, as demonstrated below:
>>> a_list = [[1], 2, 3]
>>> b_list = list(a_list)
>>> id(a_list) == id(b_list)
False
>>> id(a_list[0]) == id(b_list[0])
True
Once again another example of how to create a shallow copy of the original list.
json.dumps()
And
json.loads()
If you want another technique for creating a deep copy of the original list then another approach could be to extract the original list using
json.dumps
(which converts the list to a string) and then to use
json.loads
to read the string and create another list.
Here’s how this process looks and the resulting
id
checks:
>>> import json
>>> a_list = [[1], 2, 3]
>>> b_list = json.loads(json.dumps(a_list))
>>> id(a_list) == id(b_list)
False
>>> id(a_list[0]) == id(b_list[0])
False
As you can see from this result using the
json.loads
in combination with the
json.dumps()
methods you can achieve a deep copied list that has its own references for the elements in each list.
Summary
When assigning a new list variable from an existing one it can be easy to think that the contents of the copied list are cloned into the new list. However, it doesn’t take long until you realise that the new list actually makes changes to the original list!
To make distinct clones of the original list determine whether you will need to do a shallow or deep copy of the list first. Once you know the type of copying you will need to do you can then apply any of the techniques above to accomplish what you need.
Popular shallow copying techniques for lists involve:
list.copy()
,
copy.copy()
,
list[:]
and
list(list)
. All of these create a shallow copy meaning that if the list contains elements such as lists or dictionaries and these are likely to be modified
after
the assignment this will change both lists too.
Popular deep copying techniques for lists involve using the
copy.deepcopy()
or
json.loads(json.dumps(list))
techniques. Each method requires either the
copy
or the
json
library respectively.