A common requirement in Python is to split a string into the characters that make it up. I’ve previously shown how you can do this by breaking up a word into separate cells using a spreadsheet, but how do you do this in Python?
Because a string is a sequence, each unit element within it, being a character, can be referenced by its index on the string.
For example:
>>> my_string = "How long is a piece of string?"
>>> my_string[0]
'H'
Knowing that you can reference parts of a string in the same way you can reference elements in a list, you should be able to see how you can loop through a string, extract each element and insert it into its own list.
>>> my_string = "How long is a piece of string?"
>>> character_list = []
>>> for idx in range(len(my_string)):
...     letter = my_string[idx]
...     character_list.append(letter)
...
>>> print(character_list)
['H', 'o', 'w', ' ', 'l', 'o', 'n', 'g', ' ', 'i', 's', ' ', 'a', ' ', 'p', 'i', 'e', 'c', 'e', ' ', 'o', 'f', ' ', 's', 't', 'r', 'i', 'n', 'g', '?']
Let’s break up the code above and explain what has just happened:
After defining the string, the first thing to do is create an empty list into which each character from the original string will be inserted. This is seen in the line: character_list = []
Next, create a for loop that iterates through every index of the string. The range() function can take a single argument, which sets the upper limit of the sequence: counting starts at 0 and stops just before that limit. The easiest way to determine the length of something such as a list or string is to use the built-in function len().
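To make the role of each function concrete, here is a minimal sketch using a short string. The names short_string, length and indexes are illustrative only and do not appear in the example above.

```python
# A short, hypothetical string used only for illustration.
short_string = "abc"

# len() returns the number of characters in the string.
length = len(short_string)

# range(length) yields the indexes 0 up to, but not including, length.
indexes = list(range(length))

print(length)   # 3
print(indexes)  # [0, 1, 2]
```

Because the last valid index of a string is always one less than its length, the indexes produced this way never run past the end of the string.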
Therefore, the combination of the range() function with the len() function sets up the index numbers needed to loop through all the characters in the string. By placing this combination in a for loop and assigning the index number to the variable idx, you can now begin to retrieve each character.
Inside the for loop, the variable letter is assigned the character found at index position idx in the original string.
Finally, the letter variable is appended to the character list to capture all the characters in the original string. To show the result of the entire operation you can print the list.
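If you need this in more than one place, the same steps could be wrapped in a function. The sketch below is one possible way to do that; the function name split_into_characters and its parameter text are hypothetical and are not part of the example above.

```python
def split_into_characters(text):
    """Return a list holding each character of text (hypothetical helper)."""
    # Start with an empty list to collect the characters.
    character_list = []
    # range(len(text)) supplies every valid index of the string.
    for idx in range(len(text)):
        letter = text[idx]
        character_list.append(letter)
    return character_list


print(split_into_characters("string"))  # ['s', 't', 'r', 'i', 'n', 'g']
```

An empty string gives an empty list, because range(0) produces no indexes and the loop body never runs.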
One-Liner
This entire process can be further compressed into one line using list comprehensions. Here’s what it would look like:
>>> my_string = "How long is a piece of string?"
>>> [my_string[idx] for idx in range(len(my_string))]
['H', 'o', 'w', ' ', 'l', 'o', 'n', 'g', ' ', 'i', 's', ' ', 'a', ' ', 'p', 'i', 'e', 'c', 'e', ' ', 'o', 'f', ' ', 's', 't', 'r', 'i', 'n', 'g', '?']
How amazing is that list comprehension one-liner?
It achieves the intended result and uses the same concepts taught in the longer form above. The main for loop is the same in both instances, but the expression that was assigned to the letter variable, my_string[idx], has now moved to the front of the list comprehension. Instead of appending each letter to an existing character_list list, the whole expression is wrapped in square brackets, which builds the list directly.
The output could have been captured in a variable like character_list, but as the aim was only to show the same output, that step was skipped.
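As a quick check, the sketch below captures the one-liner in a variable and compares it with the list built by the longer loop. The name loop_result is introduced here only for the comparison.

```python
my_string = "How long is a piece of string?"

# Long form: build the list with an explicit for loop.
loop_result = []
for idx in range(len(my_string)):
    loop_result.append(my_string[idx])

# One-liner: the same indexes, wrapped in a list comprehension.
character_list = [my_string[idx] for idx in range(len(my_string))]

print(character_list == loop_result)  # True
print(len(character_list))            # 30
```

Both approaches visit the same indexes in the same order, so the two lists are equal and each has one entry per character of the original string.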
Summary
To get each character from an original string into a list, use the list comprehension technique as follows:
[my_string[idx] for idx in range(len(my_string))]
where my_string is a variable referencing the string you want to break into the character list.