REGEXEXTRACT Date From String In Google Sheets

How can you extract a date from a string in a cell in Google Sheets using the powerful REGEXEXTRACT() function?

The REGEXEXTRACT(text, regular_expression) function has two parameters with the first labelled as text being the string operated on and the second labelled as regular_expression being the regular expression ( using RE2 syntax ) to extract data from.

Here are a few popular examples of how you can use this powerful function to extract dates from a string in Google Sheets.

Extract D/M/YYYY, DD/MM/YYYY, M/D/YYYY or MM/DD/YYYY

Recently I had a case where a cell contained a long string that had dates in the format D/M/YYYY or DD/MM/YYYY , for example, 1/6/2022 or 12/10/2022 .

By using the REGEXEXTRACT() function I was able to extract the numbers from the string according to what they represented and then inserted them into the DATE() function to obtain the desired result of a date cell.

The resulting formula was as follows:

A
1 This is a long string with a date at the end 1/10/2022
2 01/10/2022
=DATE(REGEXEXTRACT(A1,"/(\d{4})"), REGEXEXTRACT(A1, "\s(\d{1,2})/"), REGEXEXTRACT(A1, "/(\d{1,2})/"))
Using REGEXEXTRACT to create a valid DATE value from string

From the above example there are three distinct REGEXEXTRACT functions that are extracting from the original string in A1 the needed value according to the three parameters of the DATE() formula.

The first parameter of the DATE formula is the year which in the original string is exactly 4 digits long and in this example is found after a forward slash character / . Therefore to capture this using the regular expression "/(\d{4})" means finding a forward slash / start capture ( where four digits follow /d{4} and then close capture ) .

The second parameter of the DATE formula is the month which in the original string is found as the first number that has a trailing forward slash / . To capture this number from the string the regular expression "\s(\d{1,2})/" is used which reading it from left to right means: find a space character \s then start capture ( of a digit containing 1 or 2 numbers \d{1,2} then close capture ) and number must have a trailing slash / .

The third parameter of the DATE formula is the day which in the original string is found between the two forward slashes / . Therefore to capture this using the regular expression "/(\d{1,2})/" means to find a forward slash / and to start capture ( where there are digits of 1 or 2 in length \d{1,2} then close capture ) and number must have a trailing forward slash after it / .

If your date is in the format DD/MM/YYYY or D/M/YYYY then you could swap the second and third parameters to obtain the same result.

Here is the formula to extract dates in the format M/D/YYYY or MM/DD/YYYY :

=DATE(REGEXEXTRACT(A1,"/(\d{4})"), REGEXEXTRACT(A1, "\s(\d{1,2})/"), REGEXEXTRACT(A1, "/(\d{1,2})/"))

Here is the formula to extract dates in the format D/M/YYYY or DD/MM/YYYY :

=DATE(REGEXEXTRACT(A1,"/(\d{4})"), REGEXEXTRACT(A1, "/(\d{1,2})/"), REGEXEXTRACT(A1, "\s(\d{1,2})/"))

Extract YYYYMMDD

How do you extract from a string the date value in the format YYYYMMDD ?

In the same way REGEXEXTRACT was used above to extract the digits from a string the same function can be used to extract the required values from a date in the format YYYYMMDD .

Here’s a demonstration of how this works in a Google Sheet:

A
1 This is a long string with a date of 20220110
2 01/10/2022
=DATE(REGEXEXTRACT(A1,"(\d{4})\d{4}"), REGEXEXTRACT(A1, "\d{4}(\d{2})\d{2}"), REGEXEXTRACT(A1, "\d{6}(\d{2})"))
Using REGEXEXTRACT to create a valid DATE value from string where format of the date is YYYYMMDD

As you can see the same DATE and REGEXEXTRACT functions are used to grab the necessary numbers from the string.

The first REGEXEXTRACT contains the regular expression (\d{4})\d{4} which means capture the first four digits where another four digits follow – which means I’m capturing the year part of the YYYYMMDD formatted date.

The second REGEXEXTRACT contains the regular expression \d{4}(\d{2})\d{2} which means find four digits then start capture of two digits that are then followed by two digits. This is how to extract the month from the YYYYMMDD format.

The third and last REGEXEXTRACT contains the regular expression \d{6}(\d{2}) which means find 6 digits then capture the last two digits. This is how to extract the day of the month component from the string that is formatted YYYYMMDD .

Here is the formula again:

=DATE(REGEXEXTRACT(A1,"(\d{4})\d{4}"), REGEXEXTRACT(A1, "\d{4}(\d{2})\d{2}"), REGEXEXTRACT(A1, "\d{6}(\d{2})"))

Summary

The REGEXEXTRACT is a powerful function in Google Sheets that can easily enable you to capture dates from strings.

Whether the format of the date in the string is YYYYMMDD or MM/DD/YYYY or D/M/YYYY as demonstrated above each date format can easily be captured using the REGEXEXTRACT formula.

Photo of author
Ryan Sheehy
Ryan has been dabbling in code since the late '90s when he cut his teeth exploring VBA in Excel. Having his eyes opened with the potential of automating repetitive tasks, he expanded to Python and then moved over to scripting languages such as HTML, CSS, Javascript and PHP.