
How do you sort a two-dimensional array in Python easily without importing libraries? Thankfully, there are some native functions in Python which make sorting arrays a breeze.
I recently worked on a project that used the following two-dimensional data set:
Customer ID | Invoice ID | Days Overdue | Invoice Total |
---|---|---|---|
ABC | 12 | 3 | $100 |
DEF | 10 | 5 | $200 |
GHI | 13 | 3 | $1,000 |
This same tabular data would be represented in a Python 2D array as follows:
data = [['ABC', 12, 3, 100],
        ['DEF', 10, 5, 200],
        ['GHI', 13, 3, 1000]]
My requirement was to sort the list by the third column first, in descending order, placing the most overdue invoices at the top of the new sorted list. Then I wanted to sort the data by the fourth column second, also in descending order, so that among invoices overdue by the same number of days, those with the highest amount appear first.
To sort a two-dimensional list in Python use the sort() list method, which mutates the list, or the sorted() function, which does not. Set the key parameter for both types using a lambda function and return a tuple of the columns to sort according to the required sort order.
Using my code example above, here’s how both types work:
Sort List Method
One way to sort a two-dimensional list in Python is by using the sort() list method. The sort() list method takes two parameters, key and reverse, which enable you to set what to sort and how to sort.
If we apply this to our example above, here’s how this would look:
>>> data = [['ABC', 12, 3, 100],
...         ['DEF', 10, 5, 200],
...         ['GHI', 13, 3, 1000]]
>>> data.sort(key=lambda row: (row[2], row[3]), reverse=True)
>>> print(data)
[['DEF', 10, 5, 200], ['GHI', 13, 3, 1000], ['ABC', 12, 3, 100]]
Notice several things here. First, the original data variable’s state has changed; this is the principle of mutation at work.
The sort() method modifies the list it is called on. Therefore, if the original state of the list before the operation is important, avoid using this method on your list (see below for a non-mutating function).
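If the original order matters but you still prefer sort(), one option is to sort a copy instead. The following is only a minimal sketch; the names original, working_copy and result are made up for illustration. It also shows that sort() returns None, so its result should not be assigned to a variable you intend to use.

```python
original = [['ABC', 12, 3, 100],
            ['DEF', 10, 5, 200],
            ['GHI', 13, 3, 1000]]

# list() makes a shallow copy: a new outer list holding the same row objects.
working_copy = list(original)

# sort() reorders working_copy in place and returns None.
result = working_copy.sort(key=lambda row: (row[2], row[3]), reverse=True)

print(result)        # None
print(working_copy)  # [['DEF', 10, 5, 200], ['GHI', 13, 3, 1000], ['ABC', 12, 3, 100]]
print(original)      # [['ABC', 12, 3, 100], ['DEF', 10, 5, 200], ['GHI', 13, 3, 1000]]
```

Because the copy is shallow, the outer order of original is preserved, but both lists still share the same inner row lists.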
The second thing to notice is the key parameter.
This parameter accepts a function, written here as a lambda function, which is called once for each element in the list. Each element is a row of my two-dimensional list, which is passed to the lambda function as the parameter row. The lambda then returns a tuple containing the values we want to sort by.
In this example, I wanted to place the primary sort on the third column, which has an index of 2 in Python lists. Then I wanted to sort the fourth column, which has an index of 3. Therefore, this tuple contains references to those columns only and inserts their values into the tuple.
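To make the role of the tuple more concrete, here is a small sketch that prints the keys produced for each row. The helper name sort_key is hypothetical and simply stands in for the lambda used above.

```python
data = [['ABC', 12, 3, 100],
        ['DEF', 10, 5, 200],
        ['GHI', 13, 3, 1000]]

def sort_key(row):
    # Same logic as the lambda: (Days Overdue, Invoice Total).
    return (row[2], row[3])

keys = [sort_key(row) for row in data]
print(keys)  # [(3, 100), (5, 200), (3, 1000)]

# Tuples compare element by element: the first items decide the order,
# and the second items are only consulted when the first items are equal.
print((5, 200) > (3, 1000))  # True
print((3, 1000) > (3, 100))  # True

print(sorted(keys, reverse=True))  # [(5, 200), (3, 1000), (3, 100)]
```

This is why the row with 5 days overdue comes first, and why the two rows with 3 days overdue are then ordered by their invoice totals.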
The third thing to notice is the reverse parameter, which sets the descending order. This was relatively easy because both of my requirements had the same sort order, but what if they didn’t?
Different Sort Order For Different Columns?
What if I wanted different sort orders on different columns? For example, what if I wanted the third column to be in descending order, but the fourth column to be in ascending order?
To achieve this, we would drop the reverse parameter and operate on the values set in our lambda function’s tuple, like so:
>>> data = [['ABC', 12, 3, 100],
...         ['DEF', 10, 5, 200],
...         ['GHI', 13, 3, 1000]]
>>> data.sort(key=lambda row: (-row[2], row[3]))
>>> print(data)
[['DEF', 10, 5, 200], ['ABC', 12, 3, 100], ['GHI', 13, 3, 1000]]
Did you see the changes?
Besides removing the reverse parameter, have a look at the first tuple entry in our lambda function, -row[2]: notice the negative sign in front of the row value.
With the reverse parameter removed, all values are sorted in ascending order by default. Negating the numeric values in my third column turns the largest values into the smallest (most negative) keys, so those rows are placed on top.
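The negative sign only works on numeric columns. For other column types, one alternative (not used in my project, so treat it as a sketch) relies on Python's sort being stable: sort by the secondary column first, then by the primary column, each with its own reverse setting. The variable names below are made up for illustration.

```python
data = [['ABC', 12, 3, 100],
        ['DEF', 10, 5, 200],
        ['GHI', 13, 3, 1000]]

# Pass 1: secondary column (Invoice Total, ascending).
by_total = sorted(data, key=lambda row: row[3])
# Pass 2: primary column (Days Overdue, descending).
# Rows with equal days keep the order they had after pass 1.
two_pass = sorted(by_total, key=lambda row: row[2], reverse=True)
print(two_pass)  # [['DEF', 10, 5, 200], ['ABC', 12, 3, 100], ['GHI', 13, 3, 1000]]

# The same idea with a string column, which cannot be negated:
# Days Overdue ascending, then Customer ID descending.
by_customer = sorted(data, key=lambda row: row[0], reverse=True)
mixed = sorted(by_customer, key=lambda row: row[2])
print(mixed)  # [['GHI', 13, 3, 1000], ['ABC', 12, 3, 100], ['DEF', 10, 5, 200]]
```

The first result matches the output of the negative-sign approach, at the cost of sorting the list twice.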
Sorted Function
If you want to keep the state of the original list and return a new two-dimensional list, then you’ll want to use the sorted function.
The sorted function accepts the same key and reverse parameters as the sort list method used above, plus one additional parameter at the front for the data being sorted. The other difference is that it returns a new list instead of changing the original, as shown below:
>>> data = [['ABC', 12, 3, 100],
...         ['DEF', 10, 5, 200],
...         ['GHI', 13, 3, 1000]]
>>> new_data = sorted(data, key=lambda row: (row[2], row[3]), reverse=True)
>>> print(new_data)
[['DEF', 10, 5, 200], ['GHI', 13, 3, 1000], ['ABC', 12, 3, 100]]
Again, if different columns need different sort orders, we can remove the reverse parameter (which defaults to ascending order) and prefix with a negative sign those tuple elements we want in descending order, like so:
>>> data = [['ABC', 12, 3, 100],
...         ['DEF', 10, 5, 200],
...         ['GHI', 13, 3, 1000]]
>>> new_data = sorted(data, key=lambda row: (-row[2], row[3]))
>>> print(new_data)
[['DEF', 10, 5, 200], ['ABC', 12, 3, 100], ['GHI', 13, 3, 1000]]
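A quick way to confirm that sorted() leaves its input alone is to compare both lists afterwards. This is just an illustrative check using the same data as above.

```python
data = [['ABC', 12, 3, 100],
        ['DEF', 10, 5, 200],
        ['GHI', 13, 3, 1000]]

new_data = sorted(data, key=lambda row: (-row[2], row[3]))

print(new_data is data)  # False: sorted() built a new outer list
print(data)      # [['ABC', 12, 3, 100], ['DEF', 10, 5, 200], ['GHI', 13, 3, 1000]]
print(new_data)  # [['DEF', 10, 5, 200], ['ABC', 12, 3, 100], ['GHI', 13, 3, 1000]]

# The rows themselves are not copied; both lists refer to the same row objects.
print(new_data[0] is data[1])  # True
```

Since the rows are shared, editing a value inside a row of new_data would also show up in data; only the ordering of the outer list is independent.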
Summary
To sort a two-dimensional list using multiple columns and different sorting methods (e.g. descending order for one, ascending order for another) in Python, without using any imported libraries, use the built-in sort() list method and sorted() function.
The built-in sort list method mutates the original list into your desired order, whereas the sorted function returns a new sorted 2D list.
Another popular way of sorting items in Python is to call other functions inside the lambda function, as seen in the example where I sort items in a list based on their string length.
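That example is not reproduced here, but a minimal sketch of the idea, assuming a made-up list of words, might look like this:

```python
# Hypothetical sample data, chosen only for illustration.
words = ['banana', 'fig', 'cherry', 'kiwi']

# The lambda calls the built-in len() to produce each item's sort key.
by_length = sorted(words, key=lambda word: len(word))
print(by_length)  # ['fig', 'kiwi', 'banana', 'cherry']
```

Words of equal length, such as 'banana' and 'cherry', keep their original relative order because the sort is stable.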