How To Clean And Format Phone Numbers In Google Sheets

If you’re working with data in Google Sheets you’ll soon come across a time when you will need to clean and format phone number data entries.

To clean phone numbers in Google Sheets using the ``` REGEXEXTRACT() ``` function extract all the different fields according to the phone number entries and then combine them all into the desired format.

Here’s an example walking step by step through the process of cleaning the data set first, before then moving on to creating the desired phone number format.

Survey Current Phone Number Formats

Suppose I had the following USA phone number data set:

As you can see from the example data set the data isn’t clean. Anyone and everyone who was entering phone number data into this field was entering the data according to their own personal taste.

To clean this data set up I’m going to use the ``` REGEXEXTRACT() ``` formula to help extract the phone groups that should have been in each cell, the groups are as follows:

• Plus symbol and a one to three-digit country code (1 being USA)
• Area code of three digits, usually in brackets
• Three digits representing the prefix number
• Four digits representing the subscriber number

With this format in mind, you can represent the above as a regular expression like so:

``\+?\d{1,3}?\D?\(?\d{3}?\)?\D?\d{3}\D?\d{4}``

Let’s break the regular expression apart so that you can understand how each expression corresponds to the phone number format:

Depending upon your phone number data set you may need to remove sections to match your structure. For example, if your phone numbers don’t contain the country code then you will not need the expression ``` \+?\d{1,3}?\D? ``` or if you don’t have parentheses around your area code then you can remove the ``` \(? ``` and ``` \)? ``` sections from the regular expression.

Without a country code or parentheses around an area code your regular expression could look something as simple as this instead:

``\d{3}?\D?\d{3}\D?\d{4}``

Capture Country, Area, Prefix And Subscriber Sections From Phone Number

The final aspect needed before inserting the above regular expression into the ``` REGEXEXTRACT() ``` formula is to set what you want to capture from the expression.

In my example, I want to capture all four sections from the phone number fields: the country code, area code, prefix and subscriber numbers. To do this I need to set the captures in the regular expression and the way this is done is by wrapping the expressions in parentheses.

Using my original regular expression here’s how capturing the country code, area code, prefix and subscriber sections would look:

``\+?(\d{1,3})?\D?\(?(\d{3})?\)?\D?(\d{3})\D?(\d{4})``

Notice the newly inserted parentheses (highlighted in bold).

When this regular expression is inserted into your Google Sheet phone number data, in an adjacent column, with the ``` REGEXEXTRACT(A1, "\+?(\d{1,3})?\D?(?(\d{3})?)?\D?(\d{3})\D?(\d{4}) `````` ) ``` formula it produces the following result:

As you can see the regular expression extracts each section from the phone numbers in column A and inserts their section into the 4 columns alongside each column representing one of the captured sections.

Concatenate Phone Number Into Custom String

To translate this into your custom phone number format it would now be a case of adding another column and where a value doesn’t exist in a column to replace it with what should have been inserted.

For example, if these numbers are all USA phone numbers then for column B we could insert the string ``` "+1" ``` as a substitute (as 1 represents the country code for USA); if column B is meant to be a local area code of ``` 555 ``` then this would be the substituted value should that column be blank .

Therefore, wrapping this all up into a formula and applying spacing between each section you could insert the following formula in column F:

``=CONCATENATE("+",IF(LEN(B1),B1,"1")," (",IF(LEN(C1),C1,"555"),") ",D1," ",E1)``

Here is the result in our working spreadsheet:

As you can see the ``` CONCATENATE() ``` function has allowed you to design your own custom phone number format and the result is a neatly designed string from the “dirty data” of phone numbers in column A.

Phone Number As Custom Number Format

An alternative to constructing your own custom phone number string using the ``` CONCATENATE() ``` formula from the result of the ``` REGEXEXTRACT() ``` function is to create a phone number as a number and to apply a custom number format for that column.

This may prove handy should the data be issued to others and they prefer different formats to phone numbers.

First step with this approach is to apply the same concatenation technique but to convert it to a number using the ``` VALUE() ``` formula. This would look something like this in column F instead:

``=VALUE(CONCATENATE(IF(LEN(B1),B1,"1"),IF(LEN(C1),C1,"555"),D1,E1))``

Notice the formula has a slight difference from above: the removal of the ``` "+" ``` country code prefix, spacing and parentheses. Removing these elements will help make it easier for the ``` VALUE() ``` function to translate the string to a number, if you do get an error check these elements have been removed from your ``` CONCATENATE() ``` function.

The result of the above formula would now look something like this in your spreadsheet:

As you can see the result of this combination of formulas produces one large number.

Change Phone Number Format Using Format Menu

To change a number into a phone number format you can use the ``` Format ``` menu item and navigate to Number and then Custom Number Format .

The modal window that pops up will then need a format for how you would like to display the number. If you stick with the common format of ``` +# (###) ### ### ``` then you would type into the input box next to the Apply button that format, such that it looks like this:

The ``` # ``` symbol in the Custom Number Format field represents a number.

Once you apply this to your column with phone numbers you will see column F change to this:

Then should a user prefer a different format they can apply their own Custom Number Format field accordingly.

For example, if a user prefers square brackets and a hyphen between the prefix and subscriber numbers then their format will look something like this:

``+# [###] ###-####``

Summary

Cleaning phone numbers in a spreadsheet, like Google Sheets, is made easier with the handy ``` REGEXEXTRACT() ``` function. This function enables users to apply one regular expression to capture all the phone number entries and then by using ``` CONCATENATE() ``` can then change these captures into a constant phone number format.

Check our other post on this topic where I explore how you can use Python and regular expressions to check phone numbers too.

Ryan Sheehy
Ryan has been dabbling in code since the late '90s when he cut his teeth exploring VBA in Excel. Having his eyes opened with the potential of automating repetitive tasks, he expanded to Python and then moved over to scripting languages such as HTML, CSS, Javascript and PHP.