How do you sort a two-dimensional array in Python easily without importing libraries? Thankfully, some native functions in Python make sorting arrays a breeze.
I recently had a project with the following data structure, which listed each customer’s unpaid invoices, how many days each was overdue and how much was outstanding.
Here’s a simplified sample of the data I was working with (the first row contains the names of the columns of the array):
Customer ID | Invoice ID | Days Overdue | Overdue Amount |
---|---|---|---|
ABC | 12 | 3 | $100 |
DEF | 10 | 5 | $200 |
GHI | 13 | 3 | $1,000 |
Converting this data set into an array of arrays, or a list of lists, in Python was fairly trivial and looked something like this:
data = [
['ABC', 12, 3, 100],
['DEF', 10, 5, 200],
['GHI', 13, 3, 1000]
]
The sorting requirements I had for this data were to:
- Sort the array in descending order by the third column. This would place the longest overdue invoices at the top of the new array.
- Sort the array further by the fourth column in descending order. This would rank the outstanding amounts so that, among invoices with the same days overdue, those with the higher amount come first.
To sort a 2-d array in Python, you can use these methods and functions:
- sort(): mutates the original array, or
- sorted(): keeps the original array intact.
To control how the array elements are sorted, set the key parameter to a lambda function that returns a tuple of the columns to sort by, in order of priority.
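A tuple works as a sort key because Python compares tuples element by element, left to right, and only looks at a later element when the earlier ones are equal. Here is a minimal sketch using the days-overdue and amount values from the sample rows (the variable names are made up for illustration):

```python
# Tuples are compared position by position: the first element decides,
# and later elements only break ties.
key_def = (5, 200)    # (days overdue, amount) for invoice 10
key_ghi = (3, 1000)   # (days overdue, amount) for invoice 13
key_abc = (3, 100)    # (days overdue, amount) for invoice 12

# 5 > 3, so the amount is never consulted here.
first_column_decides = key_def > key_ghi

# Both have 3 days overdue, so the amount breaks the tie.
second_column_breaks_tie = key_ghi > key_abc

print(first_column_decides, second_column_breaks_tie)
```

Both comparisons are true, which is why the first column in the tuple acts as the primary sort and the second as the tie-breaker.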
Using the invoice data above, here’s how both approaches work:
Sort List Method
One way to sort a two-dimensional list in Python is by using the sort() list method. The sort() list method takes two parameters, key and reverse, which let you set what to sort by and how to sort.
If we apply this to our example above, here’s how this would look:
>>> data = [['ABC', 12, 3, 100],
['DEF', 10, 5, 200],
['GHI', 13, 3, 1000]]
>>> data.sort(key=lambda row: (row[2], row[3]), reverse=True)
>>> print(data)
[['DEF', 10, 5, 200], ['GHI', 13, 3, 1000], ['ABC', 12, 3, 100]]
Notice several things here: first, the original data variable’s state has changed. This is the principle of mutation at work.
With this approach, the list held by the variable changes once the .sort() method has run. Therefore, if the original state of the list is important, you want to avoid using this method on your list (see below for a non-mutating function).
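If you need both the in-place sort and the original order, one option is to copy the list before sorting. This is a small sketch, assuming a shallow copy is enough (the inner rows are shared between the two lists, which is fine as long as you don’t edit individual rows). The name original is just illustrative:

```python
data = [
    ['ABC', 12, 3, 100],
    ['DEF', 10, 5, 200],
    ['GHI', 13, 3, 1000],
]

# Shallow copy: a new outer list that still points at the same row lists.
original = data.copy()

# Sorts data in place; original keeps the initial row order.
data.sort(key=lambda row: (row[2], row[3]), reverse=True)

print(data[0])      # first row after sorting
print(original[0])  # first row of the untouched copy
```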
The second thing to notice is the key parameter.
The key parameter accepts a function, which I supplied as a lambda function (a regular function would work just as well). Python calls it once for each element in the list. Each element is a row of my two-dimensional list, which the lambda function receives as its parameter row.
From each row, I build a tuple containing the field I want to sort by first, then second, and so on.
In this example, I wanted the primary sort on the third column, which has an index position of 2. Then, I wanted to sort by the fourth column, which has a column index of 3. Therefore, the tuple references only those two columns and holds their values.
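To illustrate the point that a regular function works just as well as a lambda, here is a sketch of the same sort with a named key function. The name overdue_key is my own choice for this example, not something Python provides:

```python
def overdue_key(row):
    """Return (days overdue, overdue amount) for one invoice row."""
    return (row[2], row[3])


data = [
    ['ABC', 12, 3, 100],
    ['DEF', 10, 5, 200],
    ['GHI', 13, 3, 1000],
]

# Pass the function itself (no parentheses); sort() calls it once per row.
data.sort(key=overdue_key, reverse=True)
print(data)
```

A named function can be easier to read and reuse when the same key is needed in several places.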
The third thing to notice is the reverse parameter, which sets descending order. By default, this value is False. This was relatively easy because both of my requirements called for the same order (descending), but what if they didn’t?
Different Sort Order For Different Columns?
What if I wanted different sorting methods on different columns?
For example, what if I wanted the third column to be in descending order, but I wanted the fourth column to be in ascending order?
To achieve this, we would drop the reverse parameter and operate on the values set in our lambda function’s tuple, like so:
>>> data = [['ABC', 12, 3, 100],
['DEF', 10, 5, 200],
['GHI', 13, 3, 1000]]
>>> data.sort(key=lambda row: (-row[2], row[3]))
>>> print(data)
[['DEF', 10, 5, 200], ['ABC', 12, 3, 100], ['GHI', 13, 3, 1000]]
Did you notice the changes?
Besides removing the reverse parameter, have a look at the first tuple entry in our lambda function: -row[2]. Notice the negative sign in front of the row value.
With the reverse parameter removed, all values sort in ascending order by default. Negating the numeric values in my third column turns the largest original values into the smallest (most negative) keys, so they come first, which achieves descending order.
Because the second tuple value is not negated, that column is sorted in ascending order.
Therefore, placing a negative sign in front of a numeric tuple value can help achieve the desired result if you find the reverse parameter is putting your results in the wrong order.
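The negative-sign trick only works for values that can be negated, such as numbers. For a text column, one alternative is to rely on the fact that Python’s sort is stable (rows with equal keys keep their existing relative order) and sort in two passes, least important column first. The sketch below is my own illustration rather than part of the original project: it sorts by days overdue ascending, with Customer ID descending as the tie-breaker.

```python
data = [
    ['ABC', 12, 3, 100],
    ['DEF', 10, 5, 200],
    ['GHI', 13, 3, 1000],
]

# Pass 1: secondary key (Customer ID, a string) in descending order.
data.sort(key=lambda row: row[0], reverse=True)

# Pass 2: primary key (days overdue) in ascending order. Because the sort
# is stable, rows with the same days overdue keep the order from pass 1.
data.sort(key=lambda row: row[2])

print(data)
```

The same two-pass idea works with sorted() by feeding the result of the first call into the second.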
Sorted Function
If you want to keep the state of the original list and return a new 2-dimensional list, then you’ll want to use the sorted function.
The sorted function has the same parameters as the sort list method used above, plus one additional parameter at the front for the data being sorted. The only really noticeable difference is that it returns a new list, as shown below:
>>> data = [['ABC', 12, 3, 100],
['DEF', 10, 5, 200],
['GHI', 13, 3, 1000]]
>>> new_data = sorted(data, key=lambda row: (row[2], row[3]), reverse=True)
>>> print(new_data)
[['DEF', 10, 5, 200], ['GHI', 13, 3, 1000], ['ABC', 12, 3, 100]]
Notice that the original array data isn’t changed. This is the beauty of the sorted function.
Again, if the sort order needs to differ by column, we can remove the reverse parameter (so the sort defaults to ascending order) and prefix the tuple elements we want in descending order with a negative sign, like so:
>>> data = [['ABC', 12, 3, 100],
['DEF', 10, 5, 200],
['GHI', 13, 3, 1000]]
>>> new_data = sorted(data, key=lambda row: (-row[2], row[3]))
>>> print(new_data)
[['DEF', 10, 5, 200], ['ABC', 12, 3, 100], ['GHI', 13, 3, 1000]]
Sorted Function Doesn’t Work
Sometimes, I get confused between the .sort() list method and the sorted() function. I’ll want to mutate the original 2d array, and I’ll end up writing code like this, expecting the data variable to be mutated:
>>> data = [
['ABC', 12, 3, 100],
['DEF', 10, 5, 200],
['GHI', 13, 3, 1000]
]
>>> sorted(data, key=lambda row: (-row[2], row[3]))
>>> print(data)
[['ABC', 12, 3, 100], ['DEF', 10, 5, 200], ['GHI', 13, 3, 1000]]
Remember, the sorted() function will not mutate the original array. When inspecting your result, you may assume your variable should have changed when it didn’t (or vice versa: expecting the original array NOT to change when it did!).
If things look strange, just check which approach you’ve used.
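One quick way to check which approach you’ve used is to look at what each call returns. The sketch below contrasts the two: sorted() returns a new list and leaves data alone, while the sort() list method reorders data in place and returns None. The variable names are only for illustration:

```python
data = [
    ['ABC', 12, 3, 100],
    ['DEF', 10, 5, 200],
    ['GHI', 13, 3, 1000],
]

# sorted() hands back a new list; data itself is left alone.
new_data = sorted(data, key=lambda row: (-row[2], row[3]))
data_untouched = data[0][0] == 'ABC'

# list.sort() reorders data in place and returns None.
return_value = data.sort(key=lambda row: (-row[2], row[3]))

print(data_untouched)    # data still started with 'ABC' after sorted()
print(return_value)      # the in-place method returns None
print(data == new_data)  # both now hold the same ordering
```

So writing data = data.sort(...) is the mirror-image mistake: it replaces your list with None.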
Sorted Array With Multiple Columns In Python
To sort a two-dimensional list by multiple columns with different sort orders (e.g. descending order for one, ascending order for another) in Python, without importing any libraries, use the built-in sort() list method or the sorted() function.
The built-in sort list method mutates the original list, whereas the sorted function returns a new sorted 2D list.
Another popular way of sorting items in Python is to call a function inside the key lambda, as in the example where I sort items in a list based on their string length.