How does the int() function work in Python, and could you write your own function?
The int(x, base=10) function in Python takes two parameters: the first, x, is either a number or a string, and the second, base, is the number system that a string x is written in (10 being the default, which represents the decimal number system). The function converts x into a whole integer number.
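As a quick illustration of the base parameter, the sketch below parses a few strings written in different number systems. The sample strings and variable names are made up for this example. Note that when base is passed explicitly, x has to be a string rather than a number.

```python
# Parsing strings written in different number systems.
binary_value = int("101", 2)   # 1*4 + 0*2 + 1*1
hex_value = int("ff", 16)      # 15*16 + 15
decimal_value = int("101")     # base defaults to 10

print(binary_value, hex_value, decimal_value)  # 5 255 101

# A number combined with an explicit base is rejected.
try:
    int(101, 2)
except TypeError as error:
    base_error = type(error).__name__
    print(base_error)  # TypeError
```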
A simple example of converting a string is demonstrated below:
>>> my_string = "1234"
>>> int(my_string)
1234
You can further test that the conversion has worked properly by performing a simple mathematical operation like multiplying by 2:
>>> my_string = "1234"
>>> my_string * 2
'12341234'
>>> int(my_string) * 2
2468
As you can see from the above code, if you multiply a string by a number you get the string repeated that many times, but if you multiply a number you get the correct numerical result.
Another simple example demonstrating the conversion of a decimal number is as follows:
>>> my_float = 1234.56
>>> int(my_float)
1234
As you can see from the above code, the int() function truncates the decimal portion of a float number.
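Truncation is not the same as rounding down, which becomes visible with negative numbers. The following sketch compares int() with math.floor() on made-up sample values:

```python
import math

positive = int(1234.56)          # decimal portion dropped
negative = int(-1234.56)         # truncates toward zero
floored = math.floor(-1234.56)   # rounds down instead

print(positive, negative, floored)  # 1234 -1234 -1235
```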
How Does int() Work?
Several decisions have already been made for you when using the built-in int() function. What if these design decisions don’t match your expectations and you need something different?
Understanding how the int() function works helps you design your own should you need something different.
One way to get a better understanding of the int() function is to copy how it works. You can then change your copied design to match your own expectations.
The first thing I would do with the first parameter x is convert it into a string. With the parameter converted to a string, the rest of the code is easier to handle because you are dealing with only one data type.
Operating on the string then requires starting from the end of the string and checking the ordinal number of each character in turn.
If the ordinal number of a character is within the range of the ordinal numbers of the digits from 0 to 9, then that character can be converted to a number.
To find out the ordinal number of a character, use the built-in ord(char) function, which takes only one parameter: a string character.
For example, the ordinal number of the character 'a' is 97. The ordinal number of the character '1' is 49.
>>> ord('a')
97
>>> ord('1')
49
All the numerical digits from 0 to 9 are represented by the ordinal numbers from 48 to 57 respectively.
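The range check described above can be sketched as a small helper. The name is_ascii_digit is hypothetical and used only for illustration here:

```python
def is_ascii_digit(char):
    """Return True when char is one of the characters '0' to '9'."""
    return 48 <= ord(char) <= 57

# Two digits, a letter, and a currency symbol.
checks = [is_ascii_digit(char) for char in "19a$"]
print(checks)  # [True, True, False, False]
```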
Custom int() Alternative
To begin creating your own custom replacement of the built-in int() function, you would need to loop through each of the characters in the original string in reverse and calculate the number each one stands for.
Finally, to position the digits correctly, each digit would need to be multiplied by the base (10, or whatever base you input) raised to the power of its position, and the products accumulated to give the final result as a number.
Here’s how I tackled this problem with my own custom int() function:
def my_int(x, base=10):
    x = str(x)
    index = 0   # position of the next digit, counted from the right
    result = 0
    for char in x[::-1]:
        o = ord(char) - 48   # 48 is ord("0")
        if base > o >= 0:
            result += (base ** index) * o
            index += 1
        if char == "-":
            result *= -1
    return result
So what is happening with the above custom function my_int()?
Firstly, the custom function takes two parameters: x, the string or number to change, and base, the number used to convert the digits. The default base number is 10, which represents the decimal number system.
Once inside the function, there are a few declarations. The first makes sure the data type of the first parameter is an actual string, so the built-in str() function is used.
Next, I define the index and result variables, as these values will increment and accumulate throughout the for loop with each character.
Next, in the for loop that will loop through each character in the string, I use the slice operator [::-1] to reverse the string so that I can start at the last character and work to the front.
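To see what the reversing slice produces, here is a minimal sketch on a sample string. It uses enumerate() to show the positions; for a string made only of digits these match the index values the function tracks by hand.

```python
text = "1234"
reversed_text = text[::-1]
print(reversed_text)  # 4321

# Each character paired with its position counted from the right.
positions = list(enumerate(reversed_text))
print(positions)  # [(0, '4'), (1, '3'), (2, '2'), (3, '1')]
```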
Within the for loop, the ordinal number for zero, being 48, is subtracted from the character’s ordinal number. This calculation produces the actual digit as a number.
The if condition then checks that the difference between the ordinal numbers is less than the base and greater than or equal to zero. This ensures no character or digit outside the base range is processed.
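The subtraction and the range check can be pulled out into a standalone helper to try them in isolation. The name digit_value is hypothetical, and the sketch only mirrors the check as written above. One consequence is that letters such as 'a' are never accepted, even for base 16.

```python
def digit_value(char, base=10):
    """Return the digit char stands for, or None if it is not valid in base."""
    value = ord(char) - ord("0")   # same as ord(char) - 48
    if base > value >= 0:
        return value
    return None

print(digit_value("7"))       # 7
print(digit_value("7", 2))    # None, since 7 is not a binary digit
print(digit_value(","))       # None, since ord(",") - 48 is negative
print(digit_value("a", 16))   # None, since ord("a") - 48 is 49
```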
If the condition is true the next calculation needed is to raise the base to the index power and to multiply that number by the actual digit. Once this is done the index is incremented by one.
To demonstrate this calculation, here is the amount added to the result variable at each successful iteration when converting the string '1234':
(10 ** 0) * 4 = 4
(10 ** 1) * 3 = 30
(10 ** 2) * 2 = 200
(10 ** 3) * 1 = 1000
result = 1234
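The same trace can be reproduced with a short loop. This is only a sketch for a digits-only string, so enumerate() stands in for the manually incremented index:

```python
digits = "1234"
base = 10
result = 0
terms = []

for index, char in enumerate(digits[::-1]):
    term = (base ** index) * (ord(char) - 48)
    result += term
    terms.append(term)
    print(f"({base} ** {index}) * {char} = {term}")

print("result =", result)  # result = 1234
```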
The last if condition checks for a negative sign, and if it finds one, it multiplies the result by negative 1.
Trying this function in the wild produces the following results for this handful of tests:
>>> my_int('1,234')
1234
>>> my_int('$1,234')
1234
>>> my_int('01234')
1234
>>> my_int('1234.56')
123456
As you can see from the results, it does a great job of removing unnecessary characters such as dollar signs and thousands separators (because the standard int() function does not!), but it looks like it needs help when operating with decimals.
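For comparison, here is a sketch of how the built-in int() treats the same four test strings. The outcomes dictionary is just a convenient way to collect the results for this example.

```python
samples = ["1,234", "$1,234", "01234", "1234.56"]
outcomes = {}

for sample in samples:
    try:
        outcomes[sample] = int(sample)
    except ValueError:
        # The built-in refuses strings with separators, symbols, or a decimal point.
        outcomes[sample] = "ValueError"

for sample, outcome in outcomes.items():
    print(repr(sample), "->", outcome)
```

Only the string with a leading zero converts; the other three raise a ValueError.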
How To Handle Decimals
As shown previously, the built-in int() function truncates the decimal portion.
To truncate the decimal portion, the function needs a parameter that identifies what the decimal character is. By default it should be set to match your country’s locale; mine will be set to ".". Besides this, the for loop needs a minor change and an additional portion of code, but overall the change to the custom my_int() function is fairly simple.
Here is what the custom code would look like:
def my_int(x, base=10, decimal_char="."):
    x = str(x)
    index = 0   # position of the next digit, counted from the right
    result = 0
    for idx, char in enumerate(x[::-1]):
        o = ord(char) - 48   # 48 is ord("0")
        if base > o >= 0:
            result += (base ** index) * o
            index += 1
        if char == "-":
            result *= -1
        if char == decimal_char:
            # Start again with the decimal portion removed
            return my_int(x[:-idx-1], base, decimal_char)
    return result
The main additional piece of code is the third if condition within the for loop. Here I check whether the current character in the for loop matches the newly inserted third parameter decimal_char. If it does, everything processed so far was the decimal portion, and all that is needed is to start again without it.
This is why the function is run again with the decimal portion removed.
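The slice that removes the decimal portion can be tried on its own. In this sketch, idx is the position of the decimal character counted from the right, so slicing off idx + 1 characters drops the decimal portion together with the decimal character. The sample value is made up.

```python
x = "1234.56"
decimal_char = "."
whole_part = x   # stays unchanged if no decimal character is found

for idx, char in enumerate(x[::-1]):
    if char == decimal_char:
        whole_part = x[:-idx - 1]
        break

print(idx, whole_part)  # 2 1234
```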
Here’s how the result of this function turned out:
>>> my_int(1234.56)
1234
>>> my_int('1234.99')
1234
>>> my_int('US$1,234.50')
1234
>>> my_int("-$1,234.50")
-1234
The custom int() function works as anticipated. It handles thousands separators and negative signs, and it skips characters that should be removed without letting them hinder the conversion process.
Summary
The standard int() function converts a string or number to a whole number, including any single negative sign. The int() function also truncates any decimal portion from a number.
To get something similar that also tolerates characters that shouldn’t prevent conversion (such as a currency symbol or thousands separator), a custom function is needed.
The resulting custom function I designed that handled this was the following:
def my_int(x, base=10, decimal_char="."):
    x = str(x)
    index = 0   # position of the next digit, counted from the right
    result = 0
    for idx, char in enumerate(x[::-1]):
        o = ord(char) - 48   # 48 is ord("0")
        if base > o >= 0:
            result += (base ** index) * o
            index += 1
        if char == "-":
            result *= -1
        if char == decimal_char:
            # Start again with the decimal portion removed
            return my_int(x[:-idx-1], base, decimal_char)
    return result