How can you extract a date from a string in a cell in Google Sheets using the powerful `REGEXEXTRACT()`

function?

The `REGEXEXTRACT(text, regular_expression)`

function has two parameters with the first labelled as `text`

being the string operated on and the second labelled as `regular_expression`

being the regular expression (using RE2 syntax) to extract data from.

Here are a few popular examples of how you can use this powerful function to extract dates from a string in Google Sheets.

## Extract D/M/YYYY, DD/MM/YYYY, M/D/YYYY or MM/DD/YYYY

Recently I had a case where a cell contained a long string that had dates in the format `D/M/YYYY`

or `DD/MM/YYYY`

, for example, `1/6/2022`

or `12/10/2022`

.

By using the `REGEXEXTRACT()`

function I was able to extract the numbers from the string according to what they represented and then inserted them into the `DATE()`

function to obtain the desired result of a date cell.

The resulting formula was as follows:

A | |
---|---|

1 | This is a long string with a date at the end 1/10/2022 |

2 | 01/10/2022`=DATE(REGEXEXTRACT(A1,"/(\d{4})"), REGEXEXTRACT(A1, "\s(\d{1,2})/"), REGEXEXTRACT(A1, "/(\d{1,2})/"))` |

`REGEXEXTRACT`

to create a valid `DATE`

value from stringFrom the above example there are three distinct `REGEXEXTRACT`

functions that are extracting from the original string in `A1`

the needed value according to the three parameters of the `DATE()`

formula.

The first parameter of the `DATE`

formula is the year which in the original string is exactly 4 digits long and in this example is found after a forward slash character `/`

. Therefore to capture this using the regular expression `"/(\d{4})"`

means finding a forward slash `/`

start capture `(`

where four digits follow `/d{4}`

and then close capture `)`

.

The second parameter of the `DATE`

formula is the **month** which in the original string is found as the first number that has a trailing forward slash `/`

. To capture this number from the string the regular expression `"\s(\d{1,2})/"`

is used which reading it from left to right means: find a space character `\s`

then start capture `(`

of a digit containing 1 or 2 numbers `\d{1,2}`

then close capture `)`

and number must have a trailing slash `/`

.

The third parameter of the `DATE`

formula is the **day** which in the original string is found between the two forward slashes `/`

. Therefore to capture this using the regular expression `"/(\d{1,2})/"`

means to find a forward slash `/`

and to start capture `(`

where there are digits of 1 or 2 in length `\d{1,2}`

then close capture `)`

and number must have a trailing forward slash after it `/`

.

If your date is in the format `DD/MM/YYYY`

or `D/M/YYYY`

then you could swap the second and third parameters to obtain the same result.

Here is the formula to extract dates in the format `M/D/YYYY`

or `MM/DD/YYYY`

:

`=DATE(REGEXEXTRACT(A1,"/(\d{4})"), REGEXEXTRACT(A1, "\s(\d{1,2})/"), REGEXEXTRACT(A1, "/(\d{1,2})/"))`

Here is the formula to extract dates in the format `D/M/YYYY`

or `DD/MM/YYYY`

:

`=DATE(REGEXEXTRACT(A1,"/(\d{4})"), REGEXEXTRACT(A1, "/(\d{1,2})/"), REGEXEXTRACT(A1, "\s(\d{1,2})/"))`

## Extract YYYYMMDD

How do you extract from a string the date value in the format `YYYYMMDD`

?

In the same way `REGEXEXTRACT`

was used above to extract the digits from a string the same function can be used to extract the required values from a date in the format `YYYYMMDD`

.

Here’s a demonstration of how this works in a Google Sheet:

A | |
---|---|

1 | This is a long string with a date of 20220110 |

2 | 01/10/2022`=DATE(REGEXEXTRACT(A1,"(\d{4})\d{4}"), REGEXEXTRACT(A1, "\d{4}(\d{2})\d{2}"), REGEXEXTRACT(A1, "\d{6}(\d{2})"))` |

`REGEXEXTRACT`

to create a valid `DATE`

value from string where format of the date is `YYYYMMDD`

As you can see the same `DATE`

and `REGEXEXTRACT`

functions are used to grab the necessary numbers from the string.

The first `REGEXEXTRACT`

contains the regular expression `(\d{4})\d{4}`

which means capture the first four digits where another four digits follow – which means I’m capturing the year part of the `YYYYMMDD`

formatted date.

The second `REGEXEXTRACT`

contains the regular expression `\d{4}(\d{2})\d{2}`

which means find four digits then start capture of two digits that are then followed by two digits. This is how to extract the month from the `YYYYMMDD`

format.

The third and last `REGEXEXTRACT`

contains the regular expression `\d{6}(\d{2})`

which means find 6 digits then capture the last two digits. This is how to extract the day of the month component from the string that is formatted `YYYYMMDD`

.

Here is the formula again:

`=DATE(REGEXEXTRACT(A1,"(\d{4})\d{4}"), REGEXEXTRACT(A1, "\d{4}(\d{2})\d{2}"), REGEXEXTRACT(A1, "\d{6}(\d{2})"))`

## Summary

The `REGEXEXTRACT`

is a powerful function in Google Sheets that can easily enable you to capture dates from strings.

Whether the format of the date in the string is `YYYYMMDD`

or `MM/DD/YYYY`

or `D/M/YYYY`

as demonstrated above each date format can easily be captured using the `REGEXEXTRACT`

formula.