Recently, I had an exercise where I needed to flatten a two-dimensional list down to just one dimension, something where I needed the result to be like this:
[[1, 2, 3], [4, 5, 6], [7, 8, 9]] => [1, 2, 3, 4, 5, 6, 7, 8, 9]
There were a couple of ways I completed this task, one involved using the common for-loop process but as a one-liner, another involved using a standard Python function, and the third way introduced me to the concept of recursion in Python .
Let’s see how this evolved:
Multiple For-Loops & List Comprehension (One Liner)
The most “natural” way for most people to tackle this problem is to just use the popular for loop in Python with list comprehensions . It’s simple, effective, everybody would be able to understand what is going on and can easily be done as a one liner, like so:
my_2d_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
my_flat_list = [cell for row in my_2d_list for cell in row]
print(my_flat_list)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
The benefit of using this process is that you can apply changes to the values within the list as already explored in my previous article with list comprehensions .
If the input is known to be two-dimensional and there will be no surprises, then this could be a quick and easy way to flatten a two-dimensional list.
sum()
Function
Another way is to use Python’s standard
sum()
function – which just
accumulates
elements within lists.
While this method may impress your boss, it may not initially be apparent on what is going on.
First, let’s have a look at a 2D list example using this
sum
function:
my_2d_list = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
result = sum(my_2d_list, [])
print(result)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
Upon further investigation of the sum function according to the Python docs , the second parameter is the starting value. Let’s explore this a little further.
Why does this work?
If I run the following tests, here are some insights on the second
start
parameter:
a = [[1, 2, 3], 4]
sum(a)
Traceback (most recent call last):
File "<input>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'list'
The error received from this operation is quite helpful. Here when applying the
+
operator on each element of the list it runs into a type error because when iterating through each element of the list, there are different types.
In step form, it looks like this:
1. Get first element of list 'a' = [1, 2, 3]
2. Get second element of list 'a' = 4
3. Get step 1 value (list) + step 2 value (int)
ERR - cannot + list with int
If we changed the elements within the list to this:
a = [[1, 2, 3], [4]]
sum(a)
# [1, 2, 3, 4]
We would get a result equivalent to this because list concatenation permits the use of
+
operator when combining lists together:
[1, 2, 3] + [4]
# [1, 2, 3, 4]
But what happens when I use a list for the second parameter of the
sum
function?
If I use a simpler version to start you can see what happens when I add a value to the second parameter of the sum function:
a = [1, 2, 3, 4]
sum(a, 1)
# 11
sum(a, 1.5)
# 11.5
Those examples above would be the equivalent of:
1 + 1 + 2 + 3 + 4 = 11
1.5 + 1 + 2 + 3 + 4 = 11.5
Notice how the number 1 (or 1.5) used in the second parameter of the sum function is the starting value of the accumulation of all values in the list.
(For those familiar with the
reduce
array function in JavaScript it operates in the same way – the second parameter being the accumulator’s starting value.)
Therefore, if we change our second parameter to be a list and because we can apply the
+
operator on lists, it just concatenates other lists to the accumulator.
a = [[1, 2, 3], [4], [5, 6]]
sum(a, [])
# [1, 2, 3, 4, 5, 6]
This is the equivalent of doing the following:
[] + [1, 2, 3] + [4] + [5, 6]
Flatten 2D List & Merge (One-Liner)
We could use our newfound understanding applying the same logic when seeking to flatten a two-dimensional list and merge it with an existing one-dimensional list because any list could be used as the second parameter in the
sum
function.
Here’s an example:
a = [[4, 5, 6], [7], [8, 9]]
b = [1, 2, 3]
sum(a, b)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
To achieve the same result with the previous multiple for-loop with list comprehension method above, you would have to do the following adding an extra couple of lines of code:
a = [[4, 5, 6], [7], [8, 9]]
b = [1, 2, 3]
c = [cell for row in a for cell in row]
d = b + c
print(d)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
Flattening Problems With Sum & List Comprehensions
However, the biggest problem with these two previous methods is that it’s assumed each element within the original list is a list data type .
What do you do when you know that elements within your list could be multiple data types?
Flatten List Using Function Recursion
Another way we can flatten a list (even if it’s multi-dimensional ) is by creating a custom function that calls itself. This is known as recursion .
Let’s look at an example and break it down:
def flatten_list(lst, accum=[], idx=0):
if idx >= len(lst):
return accum
el = lst[idx]
if type(el) == list:
flatten_list(el, accum)
else:
accum.append(el)
idx += 1
return flatten_list(lst, accum, idx)
Firstly, I’ve named the function
flatten_list
and have three parameters:
lst
the multi-dimensional list to flatten; the
accum
accumulator which by default is a one-dimensional list, but could be pre-populated with a one-dimensional list if needed (as we saw above with the standard
sum
function); and the
idx
index to start (defaulted to start with the first element in the list).
Inside the recursion function, the first operation I have done is to determine if the index value is larger than the length of the list being operated on. If so, return the
accum
accumulated list.
Next, I obtain the element within the list according to its index and save this to a variable labelled
el
.
The first check on the element
el
is to determine if it’s a list data type. If so, we enter our first recursion call – we send through the element to the same function, along with what has been accumulated so far.
Otherwise, if the element
el
is not a list item, it is appended to the end of the accumulated list value.
Finally, within our recursive function we iterate the index number up one value, and then send through the same
lst
and what has been accumulated, along with the new index value.
Let’s see this recursive function in a few tests:
a = [[1, 2, 3], [4], [5, [6, [7, 8]], 9]]
b = flatten_list(a)
print(b)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
a = [1, [{2}, '3'], [4, [5, [6]], [7], 8, 9]]
b = flatten_list(a)
print(b)
# [1, {2}, '3', 4, 5, 6, 7, 8, 9]
As you can see from the above examples, our recursive list function works as anticipated – all multi-dimensional lists are flattened to a one-dimensional list.
Step By Step Recursion Function (Using Flatten List)
I’m now going to modify my recursive function by putting some
print
statements inside to show you what is happening within the operation of my function:
def flatten_list(lst, accum=[], idx=0):
print(f'List={lst}; accum={accum}, idx={idx}')
if idx >= len(lst):
print(f'Return accum={accum}')
return accum
el = lst[idx]
if type(el) == list:
print(f'Call again::List={el} accum={accum}')
flatten_list(el, accum)
else:
accum.append(el)
idx += 1
print(f'Continue::List={lst} accum={accum} idx={idx}')
return flatten_list(lst, accum, idx)
If I apply a simple example with this flatten function with print statements, here is the output:
a = [[1, 2], 3, [4]]
flatten_list(a)
Up first we get our initial entry into the function:
List=[[1, 2], 3, [4]]; accum=[], idx=0
As the first element is of
list
data type, it proceeds to call the function again, so we see the next two statements as:
Call again::List=[1, 2] accum=[]
List=[1, 2]; accum=[], idx=0
Now that we are inside the function with the first element, which is a list, what will happen next? Is the first element of this newly inserted list a list again? No. Therefore, it should proceed:
Continue::List=[1, 2] accum=[1] idx=1
We progress through to the bottom of the function, and as you can see from this print statement, the accumulator contains values, and the index has incremented to 1.
What is going to happen next?
List=[1, 2]; accum=[1], idx=1
Continue::List=[1, 2] accum=[1, 2] idx=2
The recursive function now handles the second element of the list and as it is not a list itself it progresses through the function and appends to the accumulator, and increments the index value.
What will happen next? We have an index value of 2 and the size of the list is 2.
List=[1, 2]; accum=[1, 2], idx=2
Return accum=[1, 2]
Here we can see the accumulator is returned, with the first condition in our recursion being satisfied.
What happens next?
Continue::List=[[1, 2], 3, [4]] accum=[1, 2] idx=1
The code now returns back to what it was when it first started – back with the original list, but notice a couple of things: the accumulator contains the list of values returned and the
idx
value is 1 not 2.
The original state of the
idx
value is restored to what it was prior to the recursion.
What happens next?
List=[[1, 2], 3, [4]]; accum=[1, 2], idx=1
Continue::List=[[1, 2], 3, [4]] accum=[1, 2, 3] idx=2
The next element in our original list is a numeric value, and therefore just gets added to our accumulator, the idx variable increments up one, and we’re ready to proceed to the next element.
What happens next?
List=[[1, 2], 3, [4]]; accum=[1, 2, 3], idx=2
Call again::List=[4] accum=[1, 2, 3]
As the next element in our list is a list data type it calls the flatten function again by passing in that element.
List=[4]; accum=[1, 2, 3], idx=0
Continue::List=[4] accum=[1, 2, 3, 4] idx=1
The
idx
value of
0
is used as we start a new iteration through another list and as the only element within this list is a numeric value it proceeds through, and as you can see gets appended to the accumulator (
idx
also increments).
List=[4]; accum=[1, 2, 3, 4], idx=1
Return accum=[1, 2, 3, 4]
As this list only contains one element, the index equals the length of the list and therefore returns what has been accumulated.
Continue::List=[[1, 2], 3, [4]] accum=[1, 2, 3, 4] idx=3
When we pop out of this recursion call we progress through the remainder of the function and increment the index.
List=[[1, 2], 3, [4]]; accum=[1, 2, 3, 4], idx=3
Return accum=[1, 2, 3, 4]
Finally, the last pass through this process sees it back with the original list, an index value that matches the length of the original list and therefore the output is the accumulator, being the result
[1, 2, 3, 4]
.
Why did the first recursive call not include a
return
statement, but the second call did?
You would have noticed in the
flatten_list
recursive function that the first recursive call made within that function did not have a
return
statement preceding the call, yet the second recursive call at the bottom of the function did – why is that?
If you think about it, you don’t want to return after processing the first call. The purpose of the first call is to go into the element that is a list and to flatten it.
After it has been flattened, you want to continue processing. By placing a return statement at the first call, you are stating that you DO NOT want to proceed any further: how then can you iterate to the next element?
The reason why the second call has a return statement is that the parameters placed in the calling function contain the next iteration in the list.
Be careful when creating your own recursive function, and ask yourself how the function can continue to iterate, and what is to be returned.
Flatten & Merge Multi-Dimensional List With List
Just as I explored above with Python’s standard
sum
function and flattening a two-dimensional list into a list, I can similarly apply the same to my custom multi-dimensional list flattening function here as shown:
a = [[4, 5], 6, [7]]
b = [1, 2, 3]
c = flatten_list(a, b)
print(c)
# [1, 2, 3, 4, 5, 6, 7]
Summary
In this article, I explored how to flatten a two-dimensional list in Python into a one-dimensional list. There are several ways to achieve this task, and each method has a slight nuance depending upon the needs of the user.
If you know for certain the values of the original list all contains lists, then the flattening method using sum function and flattening using multiple for-loops and list comprehension are good easy choices to use.
If you wanted to perform further operations on the values inserted into the flattened list then the flattening using multiple for-loops with list comprehension would be the better choice of the two.
However, if you’re uncertain as to what the data types of each element within the list would contain then it might be best using a custom recursive function in Python .